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Abstract 


Multirate  systems,  which  find  application  in  the  design  and  analysis  of  filter  banks, 
are  demonstrated  to  also  be  useful  as  a  computational  paradigm.  It  is  shown  that  any  prob¬ 
lem  which  can  be  expressed  as  a  set  of  vector-vector,  matrix-vector  or  matrix-matrix  oper¬ 
ations  can  be  recast  using  multirate.  This  means  all  of  numerical  linear  algebra  can  be  re¬ 
cast  using  multirate  as  the  underlying  computational  paradigm.  By  viewing  multirate  as  a 
computational  paradigm,  many  problems  found  in  signal  processing  can  also  be  reformu¬ 
lated  into  fast  parallel  algorithms.  For  example,  this  paradigm  is  applied  in  a  straight  for¬ 
ward  fashion  to  the  Fast  Fourier  Transform  (FFT)  and  the  Discrete  Hartley  Transform 
(DHT)  to  create  fast  parallel,  or  multirate,  versions  of  these  algorithms. 

As  a  non-trivial  example,  the  multirate  computational  paradigm  is  applied  to  the 
problem  of  Generalized  Discrete  Time-Frequency  Distributions  (GDTFD)  to  create  a  new 
family  of  fast  algorithms  for  the  calculation  of  Time-Frequency  Distributions  (TFD).  The 
first  result  of  the  application  of  multirate  as  a  computational  paradigm  to  GDTFD’s  is  a 
new  class  of  distributions  called  the  Decimated  GDTFD  (D-GDTFD).  These  distributions, 
which  are  based  upon  the  Zak  transform,  trade  bandwidth  for  speed.  For  a  decimation  fac¬ 
tor  of  m,  there  is  an  m  fold  increase  in  throughput  (or  speed  of  calculation).  The  corre¬ 
sponding  reduction  in  discrete  bandwidth  is  from  27C  for  the  GDTFD  to  2n/m  for  the  D- 
GDTFD.  An  important  attribute  of  the  D-GDTFD  is  that  it  requires  significantly  less  stor¬ 
age  than  the  GDTFD.  The  D-GDTFD  requires  only  1/m^  of  the  storage  of  the  GDTFD. 

By  combining  several  D-GDTFD’s,  it  is  possible  to  reconstruct  a  GDTFD.  This  re¬ 
construction  of  D-GDTFD’s  is  the  Multirate  Time-Frequency  Distribution  (MRTFD). 
Each  D-GDTFD  is  independent,  and  as  a  result,  the  MRTFD  can  easily  be  implemented  in 
parallel  for  an  increase  in  throughput  on  the  order  of  m.  If  additional  parallel  paths  are 


available,  the  individual  D-GDTFD’s  can  also  be  implemented  in  parallel  leading  to  im¬ 
provements  in  throughput  on  the  order  of  or  more. 

Two  distinct  MRTFD  algorithms  are  presented.  The  first  MRTFD  is  based  upon  the 
inner  product  form  of  the  GDTFD  and  combines  the  Zak  transform,  weighted  spectro¬ 
grams  and  Singular  Value  Decomposition  (SVD).  It  is  called  the  SVD  MRTFD  and  calcu¬ 
lates  the  distribution  for  particular  instants  of  time.  The  second  MRTFD  is  based  upon  the 
outer  product  form  of  the  GDTFD  and  is  called  the  Circular  Convolution  MRTFD.  It  also 
builds  upon  the  Zak  transform  and  calculates  the  distribution  for  blocks  of  time  instead  of 
isolated  instants. 

Three  kernel  design  techniques  are  also  developed  to  allow  the  movement  of  alias- 
free  kernels  between  the  ambiguity  and  time-lag  domains.  This  allows  the  use  of  any  ker¬ 
nel  defined  in  either  domain  in  continuous  or  discrete  form  in  a  D-GDTFD  or  MRTFD. 
These  techniques  can  also  be  used  with  the  GDTFD. 
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An  important  part  of  understanding  a  nonstationary  signal  is  being  able  to  character¬ 


ize  its  changing  frequency  content  as  a  function  of  time.  Important  examples  of  this  are 
found  in  areas  such  as  speech  recognition  and  analysis,  machine  fault  detection,  sonar, 
radar,  spread  spectrum  and  low  probability  of  intercept  communications,  etc.  The  field  of 
Time-Frequency  Analysis  (TFA)  is  dedicated  to  the  task  of  determining  this  aspect  of  sig¬ 
nal  behavior.  This  dissertation  introduces  several  new  TFA  tools. 

1.1.  Historical  Background 

Time-Frequency  could  be  said  to  have  started  in  1946  when  Koenig,  Dunn  and  Lacy 
published  “The  Sound  Spectrograph”  which  presented  a  method  to  examine  changing 
spectral  content  of  speech  as  a  function  of  time  [31].  In  1948,  Ville  [S3]  showed  that  joint 
Time-Frequency  Distributions  (TFD)  in  signal  processing  could  be  created  based  upon  a 
bilinear  form  of  a  one  dimensional  signal.  It  turns  out  that  this  result  is  mathematically 
equivalent  to  Wigner’s  distribution  which  was  originally  developed  as  Position-Momen¬ 
tum  representation  in  quantum  mechanics  [55].  Many  efforts  were  subsequently  made  to 
create  TFA  tools.  Each  tool  had  its  advantages  and  disadvantages,  and  they  were  each 
thought  to  be  unique  and  unrelated.  Then,  in  1966,  Leon  Cohen  unified  the  field  of  TFA  by 
introducing  the  Generalized  Hme-Frequency  Distribution  (GTFD)  [17].  He  showed  that 
all  the  previous  TFA  tools  (or  distributions  as  they  are  called)  were  in  fact  members  of  the 
same  family  of  equations  related  by  a  variable  kernel  function. 

Unfortunately,  the  GTFD  generates  distortions  between  the  “correct”  terms.  Their 
removal  while  maintaining  the  other  desired  properties  of  the  TFD  is  largely  dependent 
upon  the  correct  selection  of  the  kernel  function.  Since  the  kernel  function  profoundly 


influences  the  behavior  of  the  cross-terms  in  TFD’s  and  defines  the  other  characteristics  of 
the  TFD,  proper  selection  of  the  kernel  is  paramount.  As  such,  kernel  design  has  been  the 
main  focus  of  Time-Frequency  Distribution  research.  It  can  be  argued  that  the  purpose  of 
this  sub-area  of  research  is  to  improve  the  accuracy  of  Time-Frequency  Distributions. 

Another  important  sub-area  of  research  is  to  improve  the  performance  of  TFD’s. 
Using  traditional  techniques,  solution  of  the  Generalized  Discrete  Time-Frequency  Distri¬ 
bution  (GDTFD)  is  an  0{N^  log  N)  process  which  can  make  it  impractical  for  many  real 
time  applications  [9].  Except  for  one  recent  paper  by  Cunningham  and  Williams  [19],  fast 
algorithms  to  calculate  the  GDTFD  have  been  largely  overlooked.  The  goal  of  this 
research  is  to  rectify  this  shortcoming  by  developing  fast  algorithms  to  calculate  the 
GDTFD. 

1.2.  Problem  Statement  and  Scope 

The  problem  answered  in  this  dissertation  is;  Is  there  a  systematic  means  to  deter¬ 
mine  a  fast  method  (or  methods)  for  calculating  the  GDTFD?  If  so,  what  is  it,  and  how  can 
it  be  used  to  implement  a  fast  GDTFD?  The  ideas  found  in  multirate  provide  a  basis  for 
the  approach  to  solve  this  problem.  It  furnishes  a  powerful  divide  and  conquer  formalism 
which  can  incorporate  the  Time-Frequency  Distribution  problem  as  a  particular  applica¬ 
tion.  Multirate  provides  die  scope  for  this  problem  and  is,  in  fact,  the  systematic  means 
that  is  sought. 

1.3.  General  Approach 

The  approach  to  fast  algorithms  taken  in  this  dissertation  is  to  use  multirate  as  an 
underlying  computational  paradigm.  This  is  a  new  computational  paradigm.  Instead  of 
using  multirate  as  a  tool  to  design  and  implement  filter  banks  only,  multirate  is  used  as  a 
powerful  tool  to  design  and  implement,  in  parallel,  divide  and  conquer  algorithms.  While 
the  focus  of  this  dissertation  is  to  develop  new  TFA  tools,  it  is  shown  that  this  computa- 
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tional  paradigm  can  be  applied  to  any  problem  which  is  composed  of  a  set  of  vector-vec¬ 
tor,  matrix-vector  or  matrix-matrix  operations.  This  effectively  encompasses  all  of 
numerical  linear  algebra  and  much  of  signal  processing. 

The  first  step  in  developing  a  fast  GDTFD  algorithm  is  to  demonstrate  that  the  new 
paradigm  is  useful  for  commonly  used,  rather  straightforward,  signal  processing  algo¬ 
rithms — the  Fast  Fourier  Transform  (FFT)  and  the  Discrete  Hartly  Transform  (DHT).  The 
remainder  of  the  dissertation  applies  this  new  computational  paradigm  to  a  non-trivial 
{^plication — the  calculation  of  Generalized  Discrete  Time-Frequency  Distributions. 

The  successful  development  of  the  multirate  FFT  and  DHT  suggest  that  multirate 
can  be  extended  to  other,  more  complicated,  algorithms  such  as  the  GDTFD,  but,  before 
developing  a  Multirate  Time-Frequency  Distribution  (MRTFD),  it  is  first  necessary  to 
extend  the  state-of-the-art  in  kernel  design  techniques.  Three  new  methods  to  move  ker¬ 
nels  between  the  ambiguity  and  time-lag  domains  are  developed.  This  allows  the  easy  cal¬ 
culation  or  modification  of  kernels  for  use  with  the  MRTFD  regardless  of  the  domain  in 
which  the  kernel  was  designed. 

Utilizing  the  new  kernel  design  techniques,  together  with  the  Zak  transform  and  the 
idea  of  weighted  spectrograms,  permits  the  creation  of  a  new  class  of  distributions  called 
the  Decimated  Generalized  Discrete  Time-Frequency  Distribution  (D-GDTFD).  The  D- 
GDTFD  forms  the  basic  building  block  for  the  MRTFD.  Each  of  these  building  blocks 
requires  less  storage  and  can  be  implemented  much  faster  than  the  GDTFD.  By  combining 
D-GDTFD’s,  it  is  possible  to  create  a  MRTFD  which  is  exactly  equivalent  to  the  GDTFD 
but  can  be  calculated  in  parallel  and  in  significantly  less  time. 

1.4.  Organization 

In  Chapter  2,  background  information  on  Time-Frequency  Distributions,  multirate 
and  other  necessary  subject  areas  is  given.  The  new  computational  paradigm  is  introduced 
in  Chapter  3  along  with  the  multirate  vector-vector,  matrix-vector  and  matrix-matrix  oper- 
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ations  as  well  as  the  multirate  FFT  and  DHT.  In  Chapter  4,  the  new  tools  for  kernel  design 
are  presented,  and  the  Zak  transform  and  its  connection  to  the  D-GDTFD  is  developed  in 
Chapter  5.  With  these  tools,  the  Multirate  Time-Frequency  Distribution  is  developed  and 
presented  in  Chapter  6.  Finally,  in  Chapter  7,  the  summary  and  recommendations  are 
given. 
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2.  Background 


2.1.  Introduction 

An  overview  consisting  of  definitions  and  properties  of  Time-Frequency  Distribu¬ 
tions,  multirate,  parallel  algorithms,  tensor  notation  and  the  Zak  transform  is  presented. 
This  material  forms  a  foundation  for  the  following  chapters  and  establishes  conventions 
with  regard  to  definitions  and  notation  that  will  be  followed  throughout  the  dissertation. 

2.2.  Some  Important  Domains 

In  this  dissertation,  reference  is  continually  made  to  three  different  domains  or 
planes.  These  are  the  time-lag  (or  t-x)  domain,  ambiguity  (or  0-t)  domain  and  time-fre¬ 
quency  (or  r-(D)  domain.  The  time-lag  plane  is  the  domain  of  the  bilinear  form  of  the  sig¬ 
nal,  Rj(t,x),  defined  as 

R/O.t)  =/(<  +  !)/•(<- 5)  (2.1) 


Ambiguity  Domain  or  Plane 

Domain  of  the  Ambiguity  Function, 
the  Generalized  Ambiguity  Function 
and  the  Characteristic  Function 


0 - 1 


Time-Lag  Domain  or  Plane 

^{t,x) 

Dotmin  of  the  Bilinear  Signal  defined  as. 


>(0 


ICO 


I  Time-Frequency  Domain  or  Plane 

Domain  of  the  Time-Frequency  Distribu¬ 
tion,  Time-Frequency  Representation  ami 
Probability  Distribution. 


Figure  2.1.  The  Domains  of  the  AF-GDTFD  and  Their  Relationships  One  to  Another. 
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where /is  a  one  dimensional  signal.  It  is  related  to  the  ambiguity  function  by  the  inverse 
Fourier  transform  of  /f/t.x)  with  respect  to  /.  The  ambiguity  plane  is  also  the  domain  of  the 
generalized  ambiguity  function  (the  product  of  the  bilinear  signal,  R^t,x),  and  the  ambigu¬ 
ity  plane  kernel,  <Kt.t)).  In  the  case  where  the  end  result  is  a  probability  density  function, 
the  generalized  ambiguity  function  is  also  called  the  characteristic  function.  The  ambigu¬ 
ity  plane  is  related  to  the  time-frequency  plane  via  a  two  dimensional  Fourier  transform. 
(See  Figure  2.1.) 

2.3.  Generalized  Time-Frequency  Distributions 

2.3.1.  Outer  Product  Formulation.  The  Generalized  Time-Frequency  Distribution 
(GTFD)  is  given  by  Cohen  [16] 

oo  oo 

^oo 

where  Y(^>'t)  is  the  kernel  in  the  time-lag  domain.  It  is  related  to  the  kernel  in  the  ambigu¬ 
ity  plane,  ^Q,x),  by 


Y(t,x)  =  j(l)(0,x)e 


(2.3) 


A  common  discrete  form  of  equation  (2.2),  called  the  Discrete  Time-Frequency  Dis¬ 
tribution  (DTFD)  is  given  by 


^-1 

2  r  N-\ 


C(n,k;(j>)  =  ^  ^  fiu  +  x)f*iu-x)\^in-u,x) 


\  jlnkx 

N 

e 


(2.4) 


where  N  is  the  number  of  samples  and  for  ease  of  computation  is  chosen  to  be  a  power  of 
two. 


2.2 


Figure  2.2a.  Rj(t,k)  -  Rectangular  Grid.  Figure  2.2b.  R^t.k)  -  Hexagonally 

Eiecimated  Rectangular  Grid. 

For  a  discrete  signal,/  the  values  ofJ(t),  te  Z,  are  known,  but  the  formulation 
f(u  +  x)f*(u-x),u,X€  Z,  includes  only  half  of  the  possible  bilinear  data  points.  To 
illustrate,  define  a  function  Rj(t,k)  such  that 

«/«.*)  =  <2-5) 

where  t,k^Z.  Now  examine  Rf  on  the  t-k  plane  for  the  case  of  t  and  k  restricted  to  inte¬ 
gers  values.  The  support  of  the  function  is  on  a  rectangular  sampling  grid,  but  it  does  not 
have  any  non-zero  values  for  k  odd.  Thus,  half  of  the  information  in  the  ife-direction  is  lost. 
Figure  2.2a  demonstrates  this  property.  This  inadvertent  decimation  in  the  k-direction 
means  that  the  GDTFD  has  only  half  of  the  bandwidth  theoretically  possible  for  a  given 
sampling  rate.  It  is,  therefore,  considered  to  be  periodic  in  7t  rather  than  2n. 

In  [29],  Jeong  and  Williams  note  data  is  available  whenever  the  argument  of/,  i.e. 
t  ±  k/2 ,  is  an  integer.  This  suggests  that  data  is  present  when  k  is  even  and  t  is  an  integer 
and  when  k  is  odd  and  t  is  an  integer  plus  one-half.  As  Jeong  and  Williams  point  out. 


R^t,k)  can  be  thought  of  as  lying  on  the  grid  shown  in  Figure  2.2b.  This  is  a  function 
which  has  been  sampled  at  intervals  of  1/2  in  t  and  k  and  then  hexagonally  decimated  by  a 
factor  of  four  (see,  for  example,  Vaidyanathan  [50],  p573  and  p648). 


By  taking  R^t.k)  to  be  on  the  hexagonally  decimated  rectangular  sampling  grid  (or, 
simply,  hexagonal  grid)  and  using  this  instead  of  the  rectangular  grid,  all  terms  of  the  dis¬ 
crete  signal,/,  are  being  used,  making  the  new  distribution  periodic  in  2n;.  This  is  the  basis 
of  the  Jeong  and  Williams  method  which  they  call  the  Alias-Free  Generalized  Discrete 
Time-Frequency  Distribution  (AF-GDTFD).  This  will  be  the  outer  product  form  of  the 
GDTFD  which  will  be  used  throughout  this  dissertation,  and  GDTFD  will  be  synonymous 
to  AF-GDTFD. 

2.3.2.  Inner  Product  Formulation.  The  inner  product  formulation  can  be  derived 
from  the  outer  product  by  means  of  a  double  change  of  variables.  Starting  with  (2.2),  let 
u  =  /j  -  r  -  x/2 ,  then 

oo  eo 

C^(r,0);v)  =  J  J /(r +  /,)/*(/  +  /,- X) r/T.  (2.6) 


Next,  interchanging  orders  of  integration  and  substituting  t2  =  t\-X  results  in  the  inner 
product  form,  [2]  [3]  [18]  [26] 


Cy.(r,(o;v) 


—OP 


X  [/(z  +  ye 


]  dt,  dtj 


where 


(2.7) 


2.4 


Figure  2.3.  Mapping  of  Points  in  y  to  ijir . 


Time  Shift  (by  r)  Operator  (S^f)  (x)  =  f{x-t) 
Frequency  Shift  (by  co)  Operator  (^)  =  /  (^) 


V  ( tj,  tj)  =  v(  "  ^2)  •  (2.9) 

One  advantage  of  the  inner  product  form  over  the  outer  product  form  is  seen  in  the 
discrete  case.  The  outer  product  form  requires  the  kernel  to  be  sampled  on  hexagonally 
decimated  sampling  grid  to  avoid  aliasing  [29].  This  can  cause  some  difticuities  when  de¬ 
signing  kernels  as  will  be  discussed  in  Chapter  4;  however,  the  inner  product  form  is  natu¬ 
rally  alias-free  (i.e.  it  is  2ic  periodic  rather  than  n  periodic).  The  sample  points  in 
V  (^i>  ^2)  ’  where  and  ^2  are  integers,  correspond  to  the  points  in  \\f  when  it  has  been 
sampled  on  the  hexagonally  decimated  grid.  Pictorially,  this  can  be  seen  in  Figure  2.3. 

A  GDTFD  can  be  generated  by  calculating  the  sum  of  weighted  spectrograms  [18]. 
The  weighting  factors  are  simply  the  eigenvalues  of  the  kernel  ^  >  and  the  window  func¬ 
tions  in  the  spectrograms  are  the  eigenvectors  corresponding  to  the  eigenvalues.  In  other 
words,  (2.7)  can  be  rewritten  as 
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Cj{t,  to;v) 


/(r  +  r,)  jr^* 


-72it(0(/  +  »,) 

e  at 
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where  is  the  k!^  non-zero  eigenvalue  and  jCj^  is  the  eigenvector  of  (jf .  With  this  formu¬ 
lation,  any  GDTFD  which  has  a  Hermitian  kernel  (and  hence  a  complete  set  of  eigenvec¬ 
tors)  can  be  written  as  the  sum  of  weighted  spectrograms. 

Only  those  properties  of  Time-Frequency  Distributions  needed  for  this  work  are  in¬ 
cluded  here.  Further  details  are  given  in  Cohen  [16]  and  Boashash  [8]. 


2.4.  Multirate  Background 

The  first  multirate  building  block  that  is  needed  is  the  decimator.  The  m  fold  decima- 
tor  takes  an  input  sequence  and  shortens  it  by  keeping  only  every  sample.  A  decimator 
can  also  be  used  on  multidimensional  signals.  In  this  dissertation,  both  one  and  two  di¬ 
mensional  signals  are  encountered.  The  one  dimensional  decimator  is  represented  by  a 
block  containing  the  down  arrow  symbol  (i)  and  the  decimation  factor,  i.e. 


Sample  0  12 

x(n)  =  x(0),x(l),x(2), ... 


Sample  0  1  2 

y(n)  =  x(0),x(m),x(2m), ... 


The  two  dimensional  decimator  is  a  two  by  two  matrix.  The  matrix  need  not  be  diagonal  in 
general;  however,  in  this  work  the  matrix  is  always  a  diagonal  integer  matrix  which  pro¬ 
duces  a  rectangular  decimation  scheme  (i.e.  the  decimation  in  one  dimension  is  indepen¬ 
dent  of  that  in  the  other).  The  decimation  matrix. 


= 


Wj  0 

0  m. 


(2.11) 


would  have  the  effect  of  decimating  a  two  dimensional  signal  by  m]  in  the  horizontal  (or 
x)  direction  and  m2  in  the  vertical  (or  y)  direction,  i.e. 
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The  final  multirate  element  to  be  defined  is  the  delay.  The  delay  can  also  operate  on 
multidimensional  signals.  In  this  dissertation,  both  the  one  and  two  dimensional  delays 
will  be  used.  The  one  dimensional  delay  by  m  would  delay  a  signal  by  m  samples.  The 
symbol  for  a  delay  is  z“”.  In  block  diagrams,  it  is  placed  above  an  arrow  indicating  a  delay 
between  the  blocks  being  connected  by  the  arrow,  i.e. 


-m 

x(n)  =  ...,x(m),x(m+  l),x(m  +  2),  y(n)  =  ...,x(0),x(]),x(2),  ... 

The  two  dimensional  delay  is  represented  by  where  |i  is  defined  by  (2. 1 1).  It  has  the 
effect  of  delaying  the  signal  in  the  horizontal  direction  by  m\  and  in  the  vertical  direction 
by  m2,  i.e. 
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Additional  information  on  digital  filter  design  using  multirate  signal  processing,  may 
be  found  in  Chen  and  Vaidyanathan  [14]  and  Vaidyanathan  [50].  Multirate  as  a  computa¬ 
tional  paradigm  is  discussed  in  Chapter  3. 
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For  the  purposes  of  this  dissertation,  computer  architecture  is  restricted  to  Multiple 
Instruction  streams  Multiple  Data  streams  (MIMD)  shared  memory  computers,  such  as  the 
Cray  YMP.  (For  more  information  on  this  type  of  architecture,  see,  for  example,  Hwang 
and  Briggs  [27].)  Thus,  the  phrase  “parallel  algorithm”  will  be  used  synonymously  with 
the  phrase  “parallel  algorithms  on  MEMO  shared  memory  computers.” 

An  important  concept  in  parallel  algorithms  is  synchronization  of  the  parallel  paths. 
Synchronization  is  a  place  in  the  algorithm  when  the  results  from  different  processing 
paths  must  be  recombined  before  the  algorithm  can  proceed.  A  place  in  the  algorithm 
where  synchronization  is  necessary  is  called  a  barrier  and  the  computations  done  between 
barriers  is  called  a  stage.  (See,  for  example,  [51].) 

Suppose  there  is  an  algorithm  which  has  three  stages.  It  would  have  four  barriers; 
one  at  the  start,  between  each  stage  and  one  at  the  end.  The  number  of  computations  nec¬ 
essary  in  any  given  path  to  go  from  barrier  to  barrier  is  not  necessarily  a  constant  amount 
between  processor  paths;  thus,  some  processors  will  finish  before  the  others  in  a  given 
stage  and  be  idle  while  the  others  complete  their  task.  The  greater  the  difference  between 
the  time  it  takes  to  compute  the  paths  in  a  given  stage  (i.e.  the  computational  cost  of  the 
pathways),  the  less  efficient  the  algorithm. 

For  example,  consider  the  algorithm  depicted  in  Figure  2.4.  The  length  of  the  hori¬ 
zontal  arrows  indicate  the  number  of  computations  (or  amount  of  time)  it  takes  to  process 
the  job  given  to  a  particular  processor.  Stage  one  has  a  good  balance  of  the  load  between 
the  processors,  and  as  such,  it  is  highly  efficient.  Stage  two,  on  the  other  hand,  is  poorly 
balanced  and,  as  a  result,  is  inefficient.  Stage  two  also  shows  a  sub-stage.  This  is  caused  by 
a  synchronization  which  is  necessary  between  some  fraction  of  the  total  processors.  If  the 
fraction  is  small,  this  would  create  a  sub-stage  rather  than  a  stage  since  most  of  the  proces- 
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Stage  1 

Barrier  0  Barrier  1 


Stage  2 


Stage  3 
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Processor  1 
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Processor  3 
Processor  4 
Processor  5 


Sub-Stage  2. 1  Sub-Stage  2.2 


High  Efficiency 
(Good  Balance) 


Low  Efficiency 
(Unbalanced) 


100%  Efficiency 
(Balanced) 


timA 


Figure  2.4.  Processor  Loads  Within  Stages  of  a  Parallel  Algorithm. 


sors  are  unaffected.  The  final  stage  is  perfectly  balanced  and  is,  therefore,  100  percent  effi¬ 
cient. 

For  additional  information  on  parallel  algorithms  and  associated  computer  architec¬ 
ture,  see,  for  example.  Van  Loan  [51],  Hwang  and  Briggs  [27]  or  Akl  [1]. 


2.6.  Tensor  Notation  Background 


The  tensor  product  (also  known  as  the  Kronecker  product)  of  two  matrices,  A  and  B, 
is  represented  by  the  symbol  A^B.  There  are  two  possible  definition  of  the  tensor  prod¬ 
uct:  the  left  tensor  product  and  the  right  tensor  product.  Since  both  are  equally  valid,  the 
right  tensor  product  has  been  chosen.  Suppose  Aisnxm  and  B\six  j,  then  the  right  ten¬ 
sor  product  is  defined  as  [25]  [43] 


^0,0®  ^0,1^  •  •  •  %m-\^ 

^1,0^  ^1,1^  •  •  • 


A  <B>B.  .  - 

n,m  i,] 


(2.12) 
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Jrti  X  mj 
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The  second  form  of  tensor  notation  used  is  the  direct  sum  represented  by  i4  ©  B .  It 


creates  a  block  diagonal  matrix  and  defined  as  [43] 


A®B 


A  0^ 
0  B 


(2.13) 


If  A  is  n  X  m  and  B  is  ix  j,  then  A  ©  B  is  (n  + 1)  x  (m  +  j).  For  the  set  of  matrices  {A,},  i  = 
0,...,N-  1,  the  direct  sum  is 


N-  1 

0A.  = 
1  =  0 
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0 


'^N-\ 
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Tensor  notation  also  makes  use  of  a  particular  type  of  permutation  matrix  called  the 
stride  by  m  permutation.  It  is  represented  by  PN,m  where  N  is  the  length  of  the  sequence 
being  permuted  and  m  is  the  stride.  The  effect  of  the  stride  permutation  is  to  rearrange  a 
vector,  X  s=  [jc(0)  jc(l) ...  x{N-  1)]^,  such  that  its  permutation,  Pf/^mX  =  x',  is  given  by 


(2.15) 


X  (n)  =  [jc(0)  x(m)  ...  x((L-l)m)  I  ...  Ix(m-l)  x{2m-  1)  ...  x (TV  -  1)] 

=  [xQ(n)  Xi(n)  ...  x.(n)  ... 

where L  =  iV/me  Zand 

x.{n)  =  [x(i)  x{m  +  i)  ...  x{mn  +  i)  ...  x (m (L- 1)  + /)]  ,  (2.16) 

ns=0, 1,  ...,L-  l.This  implies 

x'ik)  =  x^m(mk  modulo  TV)  +  ~  ^  (2.17) 

where  it  =  0, 1, ...,  TV-  1,  and  LJ  indicates  the  integer  portion  of  the  quantity. 


2.10 


It  is  important  to  note  that  the  stride  permutation  is  a  mathematical  operation  which 
is  often  implemented  as  an  addressing  operation  [25].  This  means  that  for  many  hardware 
configurations  it  can  be  implemented  at  almost  no  cost;  however,  the  mathematical  nature 
of  the  operation  must  never  be  forgotten. 

Additional  theorems  and  definitions  for  tensor  products  may  be  found  in  many 
sources,  including  Granata,  et  al,  [25],  Van  Loan  [51]  and  Regalia  and  Mitra  [43]. 

2.7.  Zak  Transform 

The  Zak  transform,  like  a  TFD,  transforms  a  one-dimensional  signal  into  a  two-di¬ 
mensional  representation.  It  comes  in  several  forms  [6][28],  but  only  the  finite  discrete 
Zak  transform  is  of  interest  in  this  work.  Suppose  there  is  a  continuous  one-dimensional 
signal, /c,  which  is  sampled  at  intervals  of  p  to  produce  a  discrete  signal,/,  then  the  finite 
discrete  signal  is  given  by^wp),  where  mg  (0, 1, V  - 1 },  V,  m,  L  e  and  N  =  Lm. 
The  two-dimensional  discrete  Zak  transform  of  the  finite  discrete  one-dimensional  signal, 
/,  is  given  by 

L-l 

Zj{n,k)  =  +  (2-18) 

1  =  0 

where /i  =  0,  l,...,m-  1  andA:  =  0, 1,...,L-  1. 

Another  way  to  look  at  (2.18)  is  to  consider  laying  out  the  samples  of/ in  a  two-di¬ 
mensional  array  indexed  by  the  integers  n  and  /.  Call  this  two-dimensional  function  zy(n,/). 
Figure  2.5  shows  this  mapping.  Then,  take  the  discrete  Fourier  transform  of  each  column 
of  z/n,[).  The  result  is  the  discrete  Zak  transform, 

Additional  information  regarding  Zak  transforms  may  be  found  in  Zak  [57],  Aus- 
lander,  Gertner  and  Tolimieri  [6]  and  Jansen  [28]. 
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Figure  2.5.  Mapping  of /into  Two-Dimensional  Array 


2,8.  COTCiMSiQn 

The  preceding  contains  the  fundamental  background  and  definitions  used  throughout 
this  dissertation.  It  is  important  to  keep  in  mind  the  relationship  of  the  three  domains 
(time-lag,  ambiguity  and  time-frequency)  and  to  understand  that  the  function  of  the  GTFD 
is  to  project  a  bilinear  signal  defined  in  the  time-lag  domain  onto  the  time-frequency  do¬ 
main.  It  is  sometimes  fastest  computationally  to  do  this  by  first  projecting  the  bilinear  sig¬ 
nal  from  the  time-lag  domain  into  the  ambiguity  plane  and  subsequently  projecting  it  into 
the  time-frequency  plane.  This  concept  is  the  basis  of  TFD’s. 

Multirate  is  a  means  of  sub-dividing  a  problem  into  smaller,  more  manageable,  piec¬ 
es.  The  basic  building  blocks  for  the  sub-division  process  are  the  decimator  and  the  delay. 
These  are  the  two  main  multirate  tools  used  throughout  this  dissertation,  and  they  will  be 
applied  to  the  problem  of  creating  a  Multirate  Time-Frequency  Distribution. 

With  this  foundation  laid,  a  new  computational  paradigm  will  be  introduced  in  the 
next  chapter.  Specifically,  multirate  will  be  examined  as  a  computational  paradigm  for  the 
solution  of  numerical  linear  algebraic  problems  and  for  two  special  signal  processing  cas¬ 
es:  the  Fast  Fourier  Transform  and  the  Discrete  Hartley  Transform. 
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A  new  multirate  computational  paradigm  is  presented  along  with  a  natural  way  to 
express  numerical  linear  algebra  and  signal  processing  algorithms.  This  is  the  first  step  to¬ 
ward  the  goal  of  creating  a  Multirate  Time-Frequency  Distribution.  The  paradigm  intro¬ 
duced  and  the  techniques  which  result  from  that  paradigm  form  the  basic  building  blocks 
oftheMRTFD. 

3.2.  Paradigm 

llie  field  of  multirate  signal  processing,  while  rich  in  its  potential,  has  remained  lim¬ 
ited  in  the  types  of  problems  it  has  been  used  to  solve.  From  the  start,  it  has  been  used  as  a 
means  to  design  and  implement  filter  banks  for  signal  processing  systems.  Multirate  has 
shown  itself  to  be  a  powerful  design  tool,  (see,  for  example,  [50])  but  it  has  the  potential  to 
be  used  to  solve  a  much  laiger  class  of  problems. 

Because  multirate  breaks  the  problem  into  smaller  independent  sub-problems  which 
require  fewer  computations  to  solve,  a  multirate  system  provides  three  benefits.  First,  for  a 
given  thrcNigh{Nit,  the  individual  components  can  operate  at  significantly  reduced  speed 
when  compared  to  a  non-multirate  systems  performing  the  same  task.  This  means  the  indi¬ 
vidual  components  can  be  much  less  expensive  to  produce.  Second,  by  reducing  the  re¬ 
quired  clock  rate  of  the  individual  components  the  heat  dissipation  and  power  consump¬ 
tion  budget  per  device  (and,  occasionally,  for  the  entire  multirate  system)  will  be  reduced. 
Finally,  if  throughput  is  the  governing  factor,  multirate  can  be  used  to  design  systems 
which  perform  faster  than  the  fastest  possible  sequential  system.  This  improvement  in 
speed  can  be  several  orders  of  magnitude  over  a  non-multirate  sequential  system.  The 
MRTFD  algorithms  presented  in  Chapter  6  are  examples  of  this  improvement  in  speed. 
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There  are  two  fundamental  underlying  assumptions  made  in  the  vast  majority  of  the 
multirate  literature;  the  process  to  which  multirate  is  applied,  and  the  type  of  data  on 
which  the  process  will  act.  Multirate  has  been  almost  exclusively  viewed  as  a  tool  for  de¬ 
signing  digital  filter  banks,  usually  for  the  purpose  of  encoding  or  compressing  a  signal. 
The  data  has  been  assumed  to  be  an  infinite  length  sequence.  These  assumptions  have  lim¬ 
ited  the  possible  applications  for  which  multirate  is  appropriate  and  profitable.  There  has 
been  a  limited  amount  of  filter  bank  design  which  has  been  done  with  finite  length  (or 
blocked)  input  data  (See,  for  example,  Giapter  10  in  [SO]  or  [45]).  The  utilization  of 
blocked  input  data  provides  a  starting  point  for  the  new  multirate  applications  discussed  in 
this  chapter. 

Underlying  multirate  is  a  larger  class  of  problem  solving  techniques  known  as  divide 
and  conquer  algorithms.  Divide  and  conquer  is  a  broad  field  of  algorithmic  techniques 
used  to  break  problems  into  smaller  indepentknt  sub-problems,  solve  those  sub-problems 
independently  and  recombine  their  results  to  gain  the  solution  to  the  entire  problem.  (See, 
for  example,  [12].) 

It  is  clear  that  multirate  partitions  a  given  problem  into  a  set  of  sub-problems  and  is, 
therefore,  a  divide  and  conquer  algorithm.  Can  it  be  applied  to  a  broader  class  of  prob¬ 
lems?  The  answer  is  a  resounding,  yes.  In  this  chapter,  multirate  will  be  shown  to  be  a 
powerful  tool  useful  as  a  means  of  rapidly  solving  problems  encountered  in  numerical  lin¬ 
ear  algebra  and  digital  signal  processing.  In  section  3.3,  the  utility  of  multirate  when  ap¬ 
plied  to  the  basic  building  blocks  of  numerical  linear  algebra  will  be  discussed.  In  section 
3.4,  some  examples  of  numerical  linear  algebraic  building  blocks  will  be  applied  to  signal 
processing  plications.  By  making  use  of  multirate  in  this  unconventional  sense,  signifi¬ 
cant  improvement  in  performance  is  possible  for  traditional  signal  processing  tasks.  This 
is  demonstrated  later  in  this  chapter  and  in  Chapter  5  and  Chapter  6. 
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Figure  3. 1 .  Single  Stage  Multirate  Summation. 


3.3.  Multirate  as  a  Paradigm  for  Numerical  Linear  Algebra 

Traditional  multirate  uses  finite  filters  to  process  a  non-finite  length  sequence.  What 
if  both  filter  and  sequence  were  considered  as  blocks?  This  is  called  block  filtering.  (See, 
for  example.  Chapter  10  in  [SO]  or  [45].)  Used  on  block  data,  it  has  an  inherent  benefit  of 
increasing  the  parallel  nature  of  the  filter  bank.  Moreover,  block  multirate  will  be  shown  to 
be  a  powerful  tool  to  implement  divide  and  conquer  algorithms  on  numerical  linear  alge¬ 
braic  problems.  In  this  section,  it  will  be  shown  that  multirate  ideas  can  be  applied  to  per¬ 
form  any  vector-vector,  matrix-vector  or  matrix-matrix  operation.  It  is  important  to  realize 
that  state-of-the-art  numerical  linear  algebraic  software,  such  as  LAPACK  [4],  is  built  up- 
(Hi  a  set  of  core  routines  called  Basic  Linear  Algebra  Subroutines  (BLAS)  [20][21]  which 
are  finequently  utilized  combinations  of  vector-vector,  matrix-vector  and  matrix-matrix  op¬ 
erations.  Thus,  it  will  be  shown  that  multirate  provides  an  important  new  paradigm  for 
solving  problems  in  numerical  linear  algebra. 

3.3.1.  The  Multirate  Inner  Product:  A  Vector- Vector  Operation.  The  standard  dis¬ 
crete  inner  product  is  given  by 

N-l 

=  X  •  (3.1) 

rt  =  0 
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x(m)y(m) 


x(mn  +  1) 


_ ,jc(iV-m-l)  - x[N-m-l)y(N-m-l) 

■  NM - w-  1)| - 

- 1  JfO)  - 777 -  Jr(l)y(l) 

-  >’(1)  - 

Jt(m+1)  n  Jc(/n+ l)v(m+ 1) 

y(m+l)  ' 


_ ,  x{N-m)  - - -  x(N-m)y{N-m) 

■  nM - - -  y(N-m)  - 


x(mn  +  m  -  1) 


n  =  0, 


Figure  3.2.  Two  Stage  Multirate  Summation  with  Analysis  Filters. 

Suppose  the  sequences  x  is  passed  through  a  delay  chain  and  decimated  by  N.  The  analysis 
filters  have  an  impulse  response  given  by  a  single  coefficient  corresponding  to  a  particular 
element  in  the  sequence  y.  Figure  3.1  is  then  a  iV  channel  multirate  system  which  sums  N 
elements  of  the  product  of  the  sequences  x  and  y.  That  is,  it  calculates  the  inner  product. 

Another,  equally  valid  way  to  decimate  x  is  to  decimate  in  two  stages.  For  example, 
the  first  stage  would  decimate  by  m  and  the  second  stage  would  decimate  by  N/m.  The 
analysis  filters  have  an  impulse  response  given  by  a  single  coefficient  corresponding  to  a 
particular  element  in  the  sequence  y.  In  other  words.  Figure  3.2  is  a  two  stage  multirate  in¬ 
ner  product  of  two  sequences.  This  is  expressed  mathematically  as 


m-\  m 


=  X  Y,x{nm-¥i)y{nm-^i) 


/  =  0  fi  =  0 
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Note  that  (3.2)  indicates  that  both  of  the  sequences  have  been  decimated  prior  to  pointwise 
multiplication  of  the  sequences.  This  is  considered  a  m  channel  multirate  system  since  in 
practice  it  would  be  expected  that  the  computation  done  after  the  m  fold  decimators  would 
be  performed  in  a  single  channel;  hence,  there  would  be  m  channels. 

3.3.2.  The  Multirate  Matrix- Vector  Multiply:  A  Matrix- Vector  Operation.  Multi¬ 
rate  may  also  be  applied  to  matrix-vector  operations.  In  the  case  of  the  vector-vector  oper¬ 
ation,  both  sequences  were  decimated  in  the  same  fashion.  In  the  matrix-vector  operation 
both  elements  will  also  be  decimated.  Since  the  matrix  is  a  two  dimensional  object,  some 
added  complexity  has  been  introduced  into  the  decimation  operation.  A  multirate  matrix- 
vector  operation  could  be  developed  by  decimating  the  matrix  in  only  one  direction,  name¬ 
ly  decimating  in  the  horizontal  direction  only;  however,  this  is  not  the  standard  way  an  ar¬ 
ray  is  decimated. 

Decimation  in  two  diineusions  requires  a  decimation  factor  for  both  dimensions.  The 
simplest  case  is  for  these  to  be  equal.  This  is  called  square  decimation  and  is  represented 
by  a  decimation  matrix 


H  = 


m  0 
0  m 


(3.3) 


The  multirate  decomposition  by  this  decimation  matrix  would  require  one  channel  for  ev¬ 
ery  possible  delay  in  either  dimension.  Since  their  are  m  possible  delays  in  each  direction, 
a  total  of  channels  are  required  creating  sub-matrices. 

Suppose  this  decimation  scheme  was  applied  io  A,mNxN  matrix  where  NIm  =  Le 
Z.  If  the  sub-matrices  were  put  in  block  matrix  form,  a  new  matrix,  call  it  A',  would  be 
created.  It  would  have  the  form 
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A’  = 


(3.4) 


^0, 0  ^0,  1  •  •  •  ^0,m-\ 

^1,0  "^1,  1  •  •  •  ^l,m-  1 

1,0  ^m-l,  1  •  •  ’  ^m-\,m-\\„xm 
where  the  subscripts  represent  the  delay  applied  to  A  before  it  was  decimated.  The  first 
subscript  is  the  delay  in  the  vertical  direction,  and  the  second  is  the  delay  in  the  horizontal 
direction.  The  the  block  matrix  Ajj  is  constructed  from  the  elements  of  A  in  the  following 
fashion: 

^i,j  ^i,j  +  m  ”  •  •  ^i,j+  {L-\)m 

^i  +  m,j  ^i  +  m,j  +  m  ‘  ’  *  +  m,  j  +  (L- 1) m 

=  •  •  •  (3.5) 

_^/+  [L-l)m,j  ^i+  (L-l)M,j  +  m  '  '  •  ^i+  (L-\)m,j+  ixL 

where  the  terms  are  the  elements  of  A. 

It  can  be  seen  from  the  definition  for  the  stride  by  m  operator,  (2. 17),  and  from  (3.4) 
and  (3.5)  that  A'  is  really  A'  =  m  elements  of  A'  are  ^  The 

value  I  is  the  delay  in  the  horizontal  direction,  j  is  the  delay  in  the  vertical  direction  and  p, 
n  =  0, 1,...,L-  1. 

The  signal,  jc,  must  also  be  delayed  and  decimated  but,  naturally,  in  only  one  direc¬ 
tion.  It  must  be  delayed  and  decimated  exactly  as  the  rows  of  ^4  are  (i.e.  in  the  horizontal 
direction).  Therefore,  there  are  m  sub-sequences  produced  by  m  channels.  If  the  sub-se¬ 
quences  of  X  from  each  channel  of  the  delay  and  decimation  chain  are  concatenated,  the 
result  is  the  sequence,  x',  seen  in  (2.15).  In  other  words,  it  is 

The  matrix-vector  product  can  be  written  as  the  product  of  A'  and  x',  i.e. 
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yU)  y(m  +  j)...yiN-m  +  j)\ 


Figure  3.3.  Multirate  Matrix- Vector  Multiplication. 

y  =  A'x  =  ^x.  (3.6) 

Note  the  result  is  /  and  not  y.  This  is  a  consequence  of  the  shuffling  of  the  rows  done  in 
(3.4).  By  (2. 17),  (3.6)  could  also  be  written  as  the  double  summation 


yipm  +  j)  =  X  Z  +  +  +  0  (3.7) 

/  a  0  n  =  0 

wherep  =  0,  1  andy  =  0,  l,...,m-  1 .  A  block  diagram  of  this  for  a  particular 

value  of  j  is  shown  in  Figure  3.3.  There  would  be  channels  (one  for  each  block  of  A') 
for  a  complete  implementation  of  (3.7). 

The  decimation  matrix,  p,  can  be  generalized  such  that  the  values  along  the  diagonal 
need  not  be  equal.  The  resulting  decimation  matrix  would  have  the  form 
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fA  = 


nij  0 

0  m-, 


(3.8) 


Changing  ji  will  change  (3.7)  and  allow  the  rows  and  columns  of  A  to  be  decimated  at  dif¬ 
ferent  rates.  Let  the  N  x  M matrix,  A,  be  decimated  according  to  |i,  then  the  new  multirate 
matrix-vector  product  is 


ffii  -  I  m, 

y{pm^-^k)  =  X  X  ""p/na  +  L  («"'!+'■)  (3-9) 

1-0  n  =  0 

wherep  =  0,  \,...,N/m2~  \,k  =  0,  I,...,  m2-  l.The  only  restriction  is  that M/mj,  yv/m2  e 
Z.  This  may  be  useful  in  certain  applications  such  as  under-defined  or  over-defined  sys¬ 
tems  of  equations  where  N^M.  This  is  the  multirate  matrix-vector  multiply. 

3.3.3.  Multirate  Matrix-Matrix  Multiplication.  Matrix-matrix  multiplication  can  be 
viewed  as  an  extension  of  the  matrix-vector  multiplication  discussed  in  the  previous  sec¬ 
tion.  In  the  product,  AB,  the  operation  repeatedly  calculates  a  matrix-vector  multiply  for 
each  column  of  B.  This  implies  that  the  decimation  of  B  in  the  vertical  direction  must  be 
the  same  as  that  of  A  in  the  horizontal  direction.  The  decimation  in  the  vertical  direction  of 
A  and  the  horizontal  direction  of  B  do  not  need  to  be  the  same,  but  these  will  impact  the 
structure  of  the  result. 

E>efine  three  decimation  factors,  m  j,  m2,  and  m3.  Let  mj  be  the  decimation  factor  in 
the  horizontal  direction  for  i4  and  the  vertical  direction  for  B.  Let  m2  be  the  decimation 
factor  in  the  vertical  direction  for  A,  and  m3  be  the  decimation  factor  in  the  horizontal  di¬ 
rection  for  B.  Then,  for  a  ^  x  Af  matrix.  A,  and  aMxK  matrix,  B,  define  A'  and  B'  as 
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A'  = 


‘0, 0  "’O.  1 


1,0  ^1,1 


*0,m,  -  1 


l.m,-  1 


fUj  1,0  mj  1,  I  ■  I,  m,  -  xm, 


B'  = 


®0, 0  ®0,  I 

®1,0  ^1,1 


0,  m  ,  -  I 
1,  m,  -  1 


®m,-l,0  •  •  •  ^m,-l,mj-l 


m,  X  m. 


^here 


= 


“u 


a 


N  M 
—  X  — 
m,  m. 


and 


= 


'.7 


(3.10) 


(3.11) 


(3.12) 


(3.13) 
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From  the  definition  for  the  stride  given  in  (2.15)  and  from  (3.10)  through  (3. 13),  it 
can  be  seen  that  A'  =  m,  ~  m,  •  product  ofAB^Y, 

then  the  product  of  A'  and  S'  is 


r  = 


3 


(3.14) 


and  applying  (2.17),  it  is  possible  to  write  the  equation  for  a  multirate  matrix-matrix  mul¬ 
tiply  as 


m,  -  1  /«! 


y pm^  +  q,  I'm,  +  k  ^  ^  ^ 


1  =  0  n  =  0 


pmj  +  q,  nm,  +  rnm^  +  i,  Im^  +  k 


(3.15) 


where fc  =  0,  l,...,m3-  1,^  =  0,  l,...,m2-  l,/  =  0,  1,...,  A/m3-  l,andp  =  0,  1,..., 

N/m2  - 1. 

With  the  addition  of  the  multirate  matrix-matrix  multiply  to  the  previously  defined 
multirate  vector-vector  and  matrix- vector  operations,  it  can  now  be  said  that  multirate  pre¬ 
sents  a  new  paradigm  for  numerical  linear  algebraic  problems.  As  a  demonstration  of  the 
utility  of  this  paradigm,  in  the  next  section,  the  new  multirate  linear  algebraic  tools  dis¬ 
cussed  above  will  be  used  to  construct  signal  processing  examples. 


3.4.  Multirate  as  a  Paradigm  for  Signal  Processing 

In  this  section,  two  examples  are  present  to  show  how  multirate  can  be  applied  in  a 
new  fashion  to  old  problems.  The  object  of  these  examples  is  to  recast  the  Fast  Fourier 
Transform  and  the  Discrete  Hartley  Transform  based  upon  the  paradigm  discussed  earlier. 
In  section  3.4.1,  the  multirate  FFT  (MR  FFT)  is  presented,  and  the  multirate  DHT  (MR 
DHT)  is  introduced  in  section  3.4.2 

3.4.1.  The  Multirate  Fast  Fourier  Transform.  In  this  section,  multirate  is  used  as  an 
alternate  means  to  develop  the  Four  Step  FFT  reported  by  Van  Loan  [51]  for  MIMD 
shared  memory  architectures.  The  key  feature  of  the  multirate  FFT  is  the  subdividing  of 
the  transform  into  smaller  blocks  which  can  still  be  solved  using  FFT’s.  This  section  is  di- 
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vided  into  three  parts.  First,  the  theoretical  background  for  the  MR  FFT  is  discussed  in 
3.4. 1.1,  and  the  MR  FFT  algorithm  is  covered  in  section  3.4. 1 .2.  Lastly,  the  cost  to  com¬ 
pute  the  MR  FFT  is  compared  to  the  cost  of  the  FR’  in  section  3.4. 1 .3. 

3.4. 1.1.  MR  FFl  Theory.  The  MR  FFT  is  derived  from  the  discrete  Zak 
transform.  In  Chapter  5,  it  will  be  shown  that  with  slight  modification  the  Zak  transform  is 
a  generalization  of  the  Short-Time  Fourier  Transform  (STFT).  In  this  section,  it  is  shown 
that  the  Zak  transform  is  the  basis  for  a  multirate  implementation  of  the  STFT  which  can 
make  use  of  the  computational  speed  of  the  FFT. 

The  definition  of  the  DFT  is  given  by  [39] 

N- 1  j2itnr 

^(0  =  ^ 
n  =  0 

If  all  values  of  r  are  considered  (i.e.  r  =  0, 1,..„  Af-  1),  (3.16)  is  a  ^ x  fV matrix-vector 
multiply.  Using  (3.9),  (3.16)  can  be  rewritten  as 

H-i 

-j2n(nm^■¥i)  (pm^  +  k) 

X{pm2  +  k)  =  xinm^-¥i)e  ^  (3.17) 

/  =  0  fi  =  0 

where  p  =  0,  l,...,N/m2-  1  andfc  =  0,  l,...,m2-  l.Letm  =  /ni  and  m2  =  L  =  iV/m,  then 

(3. 17)  can  be  expressed  as 

m-l  L-\  -j2it  (nm  + 1)  {pL  -f  k) 

X{pL  +  k)  =  X  ^x{nm  +  i)e  ^  (3.18) 

/  =  o  «  =  o 

where  nowp  =  0,  l,...,m-  1  andA:  =  0, 1,...,L-  1.  With  some  algebraic  manipulation, 

(3.18)  becomes 

m-\  r  -j^nik  i^_i  -j2nnk-[  -j2nip 

X(pL  +  k)  =  X  ^  ^  ^x{nm  +  i)e  ^  e  .  (3.19) 

/=o  L  n=0 
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The  innermost  summation  is  the  Zak  transform  which  is  then  multiplied  by  a  phase  shift 
e~^  .  The  outer  summation  is  a  DFT  of  the  rows  of  the  two  dimensional  product  of 

the  Zak  transform  and  the  phase  shift. 

Equation  (3. 19)  can  be  expressed  in  tensor  notation  as  a  combination  of  stride  per¬ 
mutations,  tensor  products  and  a  diagonal  matrix  multiply.  The  first  step  in  translating 
(3.19)  to  tensor  notation  is  to  apply  the  stride  permutation  to  the  input  vector,  x,  i.e.  P^npc. 
Next,  the  discrete  Fourier  transform  of  each  L  samples  of  the  permuted  input  must  be  tak¬ 
en.  This  is  represented  by  the  right  tensor  product  ®  where  Wi  is  the  Fourier  trans¬ 
form  matrix  of  dimension  L. 

The  phase  shift  term  in  (3. 19)  is  implemented  as  a  diagonal  matrix,  D.  It  is  the  direct 
sum  of  the  set  of  diagonal  matrices  {i2(/)}r  *  =  0, 1,...,  m  -  1,  defined  as 


n.(0 


-jlni/N 

e 


-j2ni(l-  \)/N 
e 


Ixl 


(3.20) 


The  resulting  diagonal  matrix,  D,  is  given  by 

m-l 

D=  ®a,,L 

'■  =  0  (3.21) 

Next,  a  permutation  is  performed  to  prepare  the  vector  for  the  second  set  of  DFT’s. 
The  permutation  is  actually  a  combination  of  two  strides,  Pj^  ^  and  Pj^  performed  si¬ 
multaneously.  The  first  removes  the  original  stride  of  m,  and  the  second  establishes  a  new 
stride  of  L  The  last  step  is  to  apply  the  tensor  product  for  the  L  m  point  DFT’s,  i.e. 

Il  ®  ,  followed  by  another  stride  to  restore  the  output  sequence  to  the  original  order. 

The  entire  tensor  product  form  of  (3.19)  is 

X  =  WJ  iP^,i^P^,JD{l^®W,)P,x.  (3.22) 
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It  should  be  noted  that  the  DFT  operations  and  Wi  would  be  implemented  using  se¬ 
quential  FFT’s.  This  implies  that  (3.22)  could  be  further  decomposed  in  terms  of  tensor 
products;  however,  this  would  confuse  the  parallel  nature  of  this  formulation  and  is  not 
done  here. 

3.4. 1 .2.  MR  hhT  Algorithm.  The  MR  FFT  algorithm  is  a  parallel  algo¬ 
rithm.  The  most  efficient  algorithm  in  terms  of  processor  loading  requires  a  dependence 
between  the  length  of  the  input  signal,  N,  and  the  decimation  factor,  m.  The  MR  FFT  algo¬ 
rithm  will  first  calculate  the  Zak  transform  (which  implies  the  one  dimensional  vector,  jc,  is 
mapped  into  a  two  dimensional  array  as  is  depicted  in  Figure  2.S)  and  apply  the  phase  shift 
in  the  first  stage  and  then  calculate  the  Fourier  transform  of  the  rows  in  the  second  stage. 
The  Fourier  transforms  are  calculated  using  off-the-shelf  FFT  algorithms.  Thus,  the  MR 
FFT  performs  a  FFT  on  the  columns  of  the  two  dimensional  array,  Zf  in  the  first  stage,  and 
a  FFT  of  the  rows  of  the  two  dimensional  array  in  the  second.  The  phase  shift  could  be 
performed  either  at  the  end  of  the  first  stage  or  the  beginning  of  the  second.  In  this  imple¬ 
mentation,  it  is  performed  in  the  first  stage.  If  each  FFT  is  assigned  to  an  independent  pro¬ 
cessor  in  each  stage,  the  only  way  to  have  the  same  number  of  row  processors  as  column 
processors  (keeping  the  processor  load  balanced  during  each  stage)  is  to  set  m  =  Jn.  A 
block  diagram  of  the  MR  FFT  can  be  seen  in  Figure  3.4, 

Algorithm  3. 1  is  another  way  to  describe  the  MEMO  shared  memory  Four  Step  paral¬ 
lel  FFT  reported  by  Van  Loan  [51].  Stage  one  contains  steps  one  and  two,  the  FFT 
phase  shift,  as  well  has  half  of  the  third  step,  transposition  of  the  data.  The  load  at  the  be¬ 
ginning  of  stage  two  completes  the  transpose,  and  the  fourth  step  is  the  calculation  of  L  m- 
point  FFT’s. 

It  is  important  to  note  that  the  multirate  paradigm  could  also  be  used  to  efficiently 
implenwnt  the  many  significant  Fourier  transform  algorithms  of  Winograd,  Tolimieri  and 
Auslander.  See,  for  example,  [7].  One  last  comment  on  the  algorithm  before  moving  on  to 
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Algorithm  3. 1 :  Multirate  Fast  Fourier  Transform 
I  Barrier  0:  Start 

Stage  1 :  ^  points  by  parsing  L  data  points  into  m  processors 

Each  processor  calculates  an  L  point  FFT 

I  Each  processor  multiplies  the  L  point  FFT  result  by  a  phase  shift 

Output  results  to  Output  Array 
Barrier  1: 

Stage  2:  ^  points  into  L  processors  from  Output  Array 

Each  processor  calculates  an  m  point  FFT 

I 

I  Output  results  to  Output  Array 

Barrier  2:  End 

1 

I  _ 

the  DHT.  The  operations  performed  on  each  set  of  L  data  points  in  stage  one  and  set  of  m 

data  points  in  stage  two  are  identical  in  each  processor.  This  implies  that  the  algorithm  is 

i 

suitable  for  implementation  in  a  SIMD  architecture  [27], 

3.4. 1 .3.  Computational  Cost  of  MR  FFT.  The  MR  FFT  like  the  Four  Step 
FFT  uses  multiple  instances  of  the  same  sequential  FFT  algorithm  implemented  in  paral- 

i 

I  lei.  The  MR  and  Four  Step  FFT’s  require  more  computations  than  a  sequential  FFT  would 

for  a  given  length  signal;  however,  because  they  are  implemented  with  shorter  length  se¬ 
quential  FFT’s  in  parallel,  the  execution  time  is  significantly  reduced. 

If  m  is  incorrectly  chosen,  it  is  very  easy  to  obtain  an  algorithm  which  is  not  very  ef¬ 
ficient.  In  considering  the  cost  of  this  algorithm,  it  will  be  assumed  that  both  N  and  m  are 
correctly  selected  such  that  m  =  Jn  results  in  practical  split  radix-2  FFT’s  in  both  stages 
of  the  algorithm  and  that  the  loading  of  the  processors  is  optimized. 

In  Table  3.1,  the  cost  of  the  algorithm  in  terms  of  real  multiplications  and  additions 
is  presented.  The  numbers  are  based  upon  the  split  radix-2  FFT  introduced  by  Sorensen, 
Heideman  and  Burrus  in  [46].  It  requires  N  \0g2N-  3N+ 4  multiplies  and  3N  \0g2N-  3N 


+  4  additions  to  calculate  an  N  point  DFT  via  the  split  radix-2  FFT.  This  FFT  is  the  basic 
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building  block  of  the  MR  FFT.  If  the  type  of  FFT  is  changed,  then  the  cost  of  computing 
the  MR  FFT  will  change  also. 

Assume  the  time  associated  with  communications  between  processors  and  memory 
transfers  can  be  neglected  and  there  are  as  many  parallel  processors  available  as  needed. 
Then,  the  time  it  takes  to  calculate  the  MR  FFT  and  the  FFT  can  be  compared  based  upon 
the  number  of  computation  along  the  longest  path.  This  is  defined  to  be  the  processor  path 
which  requires  the  greatest  number  of  computation  within  a  given  stage.  For  a  sequential 
algorithm,  this  is  the  total  number  of  computations  for  the  algorithm.  For  a  parallel  algo¬ 
rithms,  it  is  the  sum  of  the  longest  path  of  each  stage  in  the  algorithm. 

The  longest  path  for  the  sequential  FFT  is  the  number  of  real  multiplications  and  ad¬ 
ditions  it  takes  to  calculate  an  N  point  DFT.  For  the  MR  FFT,  the  cost  of  the  longest  path  in 
stage  one  depends  upon  the  time  it  takes  to  compute  a  single  N/m  point  FFT  followed  by 
N/m  complex  multiplies  to  apply  the  phase  shift  for  that  column  of  the  resulting  Zak  trans¬ 
form.  The  cost  of  stage  two  is  governed  by  the  cost  of  the  FFT  of  the  rows. 

The  time  to  compute  or  the  throughput  of  the  MR  FFT  system  is  determined  by  the 
cost  of  the  longest  path  in  stage  one  plus  the  longest  path  in  stage  two.  The  result  of  this  is 
the  Longest  Path  column  in  Table  3.1  Next,  the  Parallel  Complexity  is  calculated  which  is 
the  total  number  of  processors  necessary  times  the  Longest  Path.  The  total  number  of  cal¬ 
culation  necessary  to  perform  the  multirate  FFT  are  given  under  the  Sequential  Complexi¬ 
ty  column.  This  represents  the  calculation  that  would  need  to  be  done  if  the  algorithm 
where  implemented  on  a  single  processor. 

A  graphical  comparison  of  the  relative  time  it  takes  to  implement  a  N  point  MR  FFT 
and  FFT  is  given  in  Figure  3.5.  The  figure  shows  the  fraction  of  the  time  it  takes  to  calcu¬ 
late  the  MR  FFT  for  different  decimation  values  relative  to  the  FFT.  The  FFT  is  a  sequen¬ 
tial  algorithm  which  is,  therefore,  processed  along  a  single  path  and  is  defined  to  take  a 
time  of  one  to  implement.  The  MR  FFT’s  have  as  many  processors  as  necessary  to  achieve 
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Table  3. 1 :  Computational  Cost  of  Multirate  Fast  Fourier  Transform 


Oper¬ 

ation 

Stage  One 

Stage  Two 

Longest  Path 
m  = 

Parallel 

Complexity 

Sequential 

Complexity 

X 

-»Og2-  +  4 

n  m 

mlogjm 
-  3/n  +  4 

-ijN  +  5 

/Vlog2A' 

-3N+5jN 

Nlog2N-3^ 

+2jN+3 

+ 

3-log2-+4 
m  m 

3mlog2m 
-  3m  +  4 

37^tog2^ 
-3jN  +  5 

3Mog2^ 

-3N+5jN 

3N\og2N-3N 

+  2JN  +  3 

the  fastest  possible  throughput.  In  the  case  of  decimation  factors  of  four  and  eight.  This 
leads  to  large  numbers  of  processors  being  idle  in  the  hrst  stage  since  only  four  or  eight 
processors  are  needed  at  that  point,  but  N/4  and  N/%  processors  are  needed  in  the  second 


stage.  For  a  balanced  load  between  the  processors,  it  is  necessary  to  let  the  decimation  fac¬ 
tor  be  the  square  root  of  the  length  of  the  input.  This  is  the  tine  labeled,  “Efficient  MR 
FFT.”  For  the  case  of  N=  1024,  the  efficient  MR  FFT  would  require  32  processors  and 
would  be  approximately  31.6  times  faster  than  the  sequential  FFT. 


3.17 


3.4.2.  The  Multirate  Discrete  Hartley  Transform.  As  a  further  example  of  the  use¬ 
fulness  of  the  paradigm  presented  in  this  chapter,  the  Discrete  Hartley  Transform  is  exam¬ 
ined  in  this  section.  This  section  is  divided  into  two  sub-sections:  first,  the  theoretical 
background  for  the  multirate  implementation  of  the  DHT  is  presented  in  3.4.2. 1 .  Next,  the 
MR  DHT  algorithm  is  given  in  3.4.2.2 

3.4.2. 1 .  MR  DHT  Theory.  Neng-Chung  Hu,  Hong-I  Chang  and  O.  K.  Er- 
soy  developed  a  modified  DHT  in  [32]  called  the  generalized  DHT  (GDHT).  The  authors 
state  that  it  generalizes  the  DHT  the  same  way  the  DFT  is  generalized;  however,  a  more 
useful  connection  for  this  example  is  the  similarity  between  the  relationship  of  the  DHT  to 
the  odd  GDHT  and  relationship  of  DCT-I  to  DCT-II  and  DST-I  to  DST-II.  The  odd  GDHT 
is  given  by 


where  k  =  0,...,N-  1.  Applying  (3.9),  the  two  stage  multirate  formalism  for  matrix- vector 
products,  (3.23)  can  be  rewritten  as 


m-  1  L-i 

X(pL  +  f)  =  X  S 

1-0  «  =  0 


2ic^n/n  +  *  + 


{pL  +  f) 


V 


N 


xinm  +  i) 


(3.24) 


wherep  =  0, 1 . /n-1  and/=0, 1,...,L- 1,L  =  A//m  and  cas{a)  =  cos  (a)  4- sin  (a) . 

Making  use  of  the  identities. 


cos  (a  -4  P)  =  cos  (a)  cos  (p)  -  sin  (a)  sin  (p) 
sin  (a  -4  P)  =  sin  (a)  cos  (p)  +  cos  (a)  sin  (P) 

equation  (3.24)  can  be  expressed  as 


(3.25) 
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cas|^ (A(-m  (n  +  1)  +  i) 


since 


Define 


cas(?52M±i>)  =  cas(^). 
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Then,  (3.26)  can  be  expressed  as 
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Because  cosine  is  an  even  symmetric  function, 


cos 


m 


=  cos 


m 


(3.27) 


(3.28) 


(3.29) 


(3.30) 


(3.31) 


and  sine  being  an  odd  symmetric  function  means 
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'(■'4 


=  -sin 


(3.32) 


for  1  =  0,  1,...,  m/2  -  1.  This  implies  (3.30)  can  be  reduced  to 


X{pL  +  f)  = 


(3.33) 


(  w-i  +  + 

(r)  I  I 

p  \  i=o  \  y  1=0  V  y 


where  M  =  m/2, 


^.(/)  =  cos  <  X,(/)+sin  ■  X,(/) 
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+  cos 


,  (3.34) 


X/U)  = 


-  >  2f,(/)-sin 


- -k-,--,(/)-sin  — ^  k-,-,(/) 


(3.35) 


Jm  p  =  o 

p=  1,2,  ...,M- 1. 


(3.36) 


This  is  the  multirate  DHT. 
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In  ottier  words,  (3.33)  is  the  sum  of  the  type  II  Discrete  Cosine  Transforms  of 
k,  if)  and  the  type  II  Discrete  Sine  Transforms  of  X.'  (/)  .  (For  more  information  on  the 
DCT  and  DST,  see  [42].) 

Equation  (3.33)  directly  calculates  the  values  for  p  =  0,  1,...,  Af  -  1.  The  values  of 
X{pL  +j0  for  p  =  M,  M  +  1,. . .,  w  -  1  are  found  indirectly.  Prior  to  adding  the  results  of  the 
Cosine  and  Sine  transforms  together,  the  value  of  the  transforms  for  p  =  M,  M  +  1 , . . . ,  m  - 
1  must  be  calculated.  For  the  DCT-II,  they  are  given  by  -X^i(m  -  p  +  1  )L  +/)  for  p  =  M  +  1 , 
M  +  2,. . . ,  m  -  1  (Xj,  being  the  result  of  the  Cosine  transform).  The  value  of  the  DCT-II  for 
M  is  zero.  For  the  DST-II,  the  value  of  the  transform  for  p  =  Af  +  1 ,  Af  +  2,. . . ,  m  -  1  is  giv¬ 
en  by  X,((m  -  p  +  \)L+J),  and  for  p  =  Af,  it  is  the  sum 

M-\ 

=  I  (3.37) 

i  =  0 

The  DHT  can  also  be  expressed  in  terms  of  tensor  notation  as  was  done  for  the  MR 
FFT.  As  with  the  MR  FFT,  the  first  step  is  to  perform  a  stride  by  m  permutation,  i.e. 

The  permuted  input  is  broken  into  two  paths.  The  first  path  is  operated  on  by  ® 
where  Hi  is  the  DHT  operator  of  dimension  L.  The  second  path  is  permuted  by  the  inver¬ 
sion  operator,  I„^Ji,  where  Ji  is  defined  as 

(3.38) 

LxL 

and  then  operated  on  by 

The  result  from  each  path  is  again  split  into  two  branches.  In  the  first  branch  the  re¬ 
sult  is  multiplied  by  a  diagonal  operator,  D^.,  and  the  second  branch  is  multiplied  by  the  di¬ 
agonal  operator,  Dy  The  diagonal  operators  are  the  direct  sums 
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(3.39) 


m  -  1 

^a= 

i  =  0 

for  a  equal  to  c  or  s  and  Q”,  is  defined  as 

1 

(Iniil-  1) 

respectively. 

The  four  branches  are  recombined  such  that  the  first  branch  of  the  first  path  is  added 
to  the  second  branch  of  the  second  path  (call  this  recombined  path  one)  and  the  second 
branch  of  the  first  path  is  subtracted  from  the  first  branch  of  the  second  path  (call  this  re¬ 
combined  path  two).  The  four  branches  have  thus  been  recombined  into  two  paths. 

The  paths  are  permuted  by  a  combination  stride  permutation  operator, 

Once  the  permutations  have  been  performed,  the  first  path  is  operated  on  by  and 

the  second  path  is  operated  on  by  ®  where  €„  is  the  kernel  with  elements  cos(2jtip/ 
m)  and  5^  is  the  kernel  with  elements  smilnip/ni),  i,p  =  0,  1,...,  w  -  1.  Q,  and  Sf„  are 
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modified  Cosine  and  Sine  Transforms.  Each  block  of  the  operators  <8>  and  ® 
are  calculated  by  first  forming  X,  (/)  and  X'  (/)  and  then  taking  the  DCT  and  DST,  re¬ 
spectively. 

The  tensor  notation  version  of  the  MR  DHT  can  now  be  stated  as 

=  PN,Li(JL®CJP^^,P^^JDJI^<S>H,)P^^xin)  (3.42) 

+  Ds  iL  ®  "l)  (L  ®  4)  PN,nr^  («)  1  +  (^Z.  ®  P^  ,P^  „, 

(L  ®  "l)  (L  ®  •//.)  («)  -  (  An  ®  "z.)  n.^  (”)  ]  > 

3.4.2.2.  MR  DHT  Algorithm.  The  first  stage  of  the  MR  DHT  implements 
the  2m  L-point  DHT’s  calculated  using  a  standard  Fast  DHT  algorithm  such  as  that  given 
in  [47].  There  are  m  DHT’s  associated  with  the  decimated  signal  and  m  associated  with  the 
decimated  and  reversed  signal.  The  second  stage  performs  the  multiplication  by  ^  for  a 
=  c  and  s.  For  each  i,  there  are  four  sets  of  multiplications,  ,  with  the  DHT  of  the  deci¬ 
mated  signal  and  the  decimated  and  reversed  signal  and  ,  with  the  DHT  of  the  deci¬ 
mated  signal  and  the  decimated  and  reversed  signal. 

Each  combination  is  calculated  in  independent  processors.  For  maximum  speed,  this 
requires  4m  processors,  but  maximum  efficiency  is  obtained  with  2m  processors.  The  algo¬ 
rithm  given  here  is  the  maximum  efficiency  version.  The  results  of  the  multiplica¬ 
tions  are  summed  together  producing  2m  L-point  sequences  which  are  output  to  two  L  x  m 
arrays.  The  first  array  is  the  solution  of 

/!,■(/)=  [cos(^jjr,(/)+sin(^)ii(/)]  i€  I0,m)  .  (3«) 

and  the  second  array  is  the  solution  of 

«,(/)=  [cos(^)xi(/)-sin(^)x,{/)]  .•e(0,m)  (3.44) 

from  equation  (3.30).  This  is  the  end  of  the  second  stage. 


3.23 


The  final  stage  calculates  the  modified  cosine  and  sine  transforms  of  the  rows  of  the 
two  arrays,  respectively,  and  sums  the  result  to  produce  the  permuted  DHT,  X(pL  +/).  A 
block  diagrams  of  the  multirate  DHT  is  shown  in  Figure  3.6.  The  MR  DHT  algorithm  is 
given  in  Algorithm  3.2. 

Algorithm  3.2:  Multirate  Discrete  Hartley  Transform 
Barrier  0:  Start 

Stage  1 :  ^  points  by  parsing  N/m  data  points  into  2m  processors.  {This 

includes  both  the  decimated  data  and  the  decimated  and  reversed  data 
which  go  to  separated  groups  of  processors.) 

Calculate  the  DHT  of  each  subsequence 

Barrier  1: 

Stage  2:  Swap  data  between  DHT  processor  pairs 

Multiply  by  Cosine  and  Sine  terms 
Output  results  to  intermediate  result  array 
Barrier  2: 

Stage  3:  intermediate  result  array  into  processors 

Calculate  modified  Sine  and  Cosine  transforms 
Output  result 
Barrier  3:  End 


Like  the  multirate  FFT,  the  multirate  DHT  performs  identical  operations  in  each 
channel  on  differing  sets  of  data  (all  of  the  same  length)  meaning  there  is  only  one  instruc¬ 
tion  set  for  a  number  of  processors.  This  implies  that  the  algorithm  is  suitable  for  imple¬ 
mentation  in  a  SIMD  architecture  [27]. 

31?,  Cpnclygiph 

In  this  chapter,  a  new  multirate  paradigm  was  introduced.  The  idea  that  multirate 
was  limited  to  the  area  of  filter  bank  design  was  discarded  and  replaced  by  a  realization 
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Figure  3.6.  Block  diagram  of  multirate  Discrete  Hartley  Transform. 
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that  multirate  is  a  powerful  divide  and  conquer  strategy  for  the  entire  held  of  Numerical 
Linear  Algebra. 

Multirate,  as  a  computational  paradigm,  was  successfully  applied  to  vector- vector, 
matrix-vector  and  matrix-matrix  operations.  Since  these  operation  form  the  foundation  of 
modem  numerical  linear  algebra,  it  can  be  seen  that  multirate  as  a  computational  paradigm 
may  be  applied  to  all  problems  encountered  throughout  the  held  of  numerical  linear  alge¬ 
bra. 

Using  the  multirate  numerical  linear  algebraic  operations  developed  here,  the  new 
multirate  paradigm  was  applied  to  two  commonly  used  algorithms;  the  FFT  and  the  DHT. 
The  resulting  multirate  algorithms  are  parallel  implementations  of  the  FFT  and  DHT,  re¬ 
spectively.  They  require  slightly  more  computations  to  calculate;  however,  when  pro¬ 
cessed  in  parallel,  they  can  deliver  dramatic  increases  in  throughput. 

It  has  now  been  demonstrated  that  multirate  as  a  computational  paradigm  is  suitable 
for  all  numerical  linear  algebraic  problems  including  many  encountered  in  signal  process¬ 
ing.  It  remains  to  apply  the  paradigm  to  the  problem  of  the  GDTFD.  The  hrst  step  in  this 
process  is  to  design  the  computational  tools  which  will  allow  any  discrete  kernel  to  be 
used  with  the  multirate  TFD.  In  Chapter  4,  these  tools  are  developed. 
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4.  Kernel  Design  Techniques  for  Alias-Free  Time-Freauencv  Distributions 


4.1,  Introduction 

The  kernels  used  for  the  GDTFD  are  designed  in  either  the  time-lag  domain  or  the 
ambiguity  domain.  In  order  for  them  to  be  used  with  the  MRTFD  presented  in  the  follow¬ 
ing  chapters,  they  must  be  discretely  represented  in  the  time-lag  domain.  For  certain  ker¬ 
nels,  this  can  be  difficult.  Therefore,  in  this  chapter,  three  different  methods  are  presented 
which  allow  kernels  developed  in  either  domain  to  be  discretely  represented  in  the  other. 

Recently,  Jeong  and  Williams  introduced  a  generalized  method  to  calculate  an  Alias- 
Free  Generalized  Discrete  Time-Frequency  Distribution  (AF-GDTFD)  [29][30].  In  this 
dissertation,  the  term  AF-GDTFD  will  be  used  synonymously  with  the  term  GDTFD  as 
the  AF-GDTFD  is  the  form  of  GDTFD  which  is  of  interest  in  this  work.  Rather  than  using 
only  half  of  the  possible  sample  points  in  the  discrete  bilinear  signal  as  the  Discrete  TFD 
(DTFD)  does,  their  method  makes  use  of  all  the  available  samples  of  the  bilinear  signal. 
Since  the  DTFD  uses  only  half  of  the  available  sample  points,  it  has  a  discrete  bandwidth 
of  7C  and  will  suffer  aliasing  for  a  Nyquist  sampled  signal.  The  Jeong  and  Williams 
GDTFD  is  2n  periodic  since  it  keeps  all  the  sample  points  and  is,  therefore,  alias-free  for  a 
Nyquist  sampled  signal. 

The  GDTFD  is  an  0(N^)  process  and  requires  the  continuous  form  of  a  kernel  in  the 
time-lag  plane  be  known.  The  modifications  to  the  basic  GDTFD  decrease  the  computa¬ 
tional  complexity  to  0(N^  log  N).  They,  also,  allow  the  use  of  kernels  designed  in  the  am¬ 
biguity  plane  where  the  continuous  form  of  the  kernel  in  the  time-lag  plane  is  unavailable 
or  difficult  to  obtain. 

This  chapter  is  divided  into  two  sections.  First,  a  description  of  the  modifications 
made  to  the  alias-free  algorithm  developed  by  Jeong  and  Williams  [29]  to  create  an 
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OiN^og  N)  algorithm  is  given,  and  second,  a  simple  method  to  modify  kernels  designed  in 
the  ambiguity  plane  to  the  alias-free  form  is  developed.  The  ability  to  design  kernels  in  the 
time-lag  domain  will  be  critical  to  the  development  of  the  Multirate  Time-Frequency  Dis¬ 
tribution. 


4.2.  A  Faster  GDI 


Assume  the  number  of  samples,  N,  is  a  power  of  two.  Then,  the  method  reported  by 
Jeong  and  Williams  in  [29]  is  an  0(N^)  process.  With  a  slight  modification,  the  GDTFD 
can  be  implemented  in  0{N^  log  N)  computations.  The  faster  form  is  based  upon  the  Co¬ 
hen’s  class  using  the  form 


C(r,Q);<|))  =  ||a (0,t)<{i (6, x)e  ^  dxdQ  (4.1) 

where  A  (r,  x)  =  ^J/(^  w  ^  ~  ambiguity  function  off.  The  dis¬ 

crete  form  of  (4.1)  is  given  by 


C(n,  *;<!>)  = 


jlnup  -jlnpn  -jlnkx 
N  N  N 


k  =  -N 


P  =  -2 


N 


To  calculate  the  GDTFD,  the  first  step  is  to  determine  the  ambiguity  function  of  /.  It 
is  found  by  calculating  the  inverse  FFT  of  the  rows  of  /?^M,k) — an  0(iV^  log  N)  process. 
Next,  the  product  of  the  kernel  and  the  ambiguity  function  is  determined — an  OfN^)  pro¬ 
cess.  Finally,  the  two-dimensional  FFT  is  performed  to  find  the  GDTFD — an  0(Af^  log  N) 
process.  Thus,  the  overall  process  is  0{N^  log  N). 

In  comparison,  the  method  given  in  [29]  makes  use  of  the  form 


Cin,k;^)  -  ^[/?J^(n,x)],^j  (4.3) 

=  jr  X 

L/n  6  Z  J  T  — >  A 
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Figure  4. 1  (a)  The  Region  of  Support  for  the  Bilinear  Form  of 
the  Signal  (i.e.  RA.  (b)  The  Region  of  Support  for  a  Kernel  of  the 
“Bow-Tie”  Class. 


where }/  is  the  appropriately  sampled  and  shifted  kernel  in  the  time-lag  plane  and  iTis  the 
Fourier  transform  operator.  The  calculation  of  the  linear  convolution  within  the  brackets 
requires  0(N^)  operations  to  complete  followed  by  an  0(yV^  log  N)  operation  to  calculate 
the  Fourier  transform  of  the  columns;  thus,  the  overall  process  is  dominated  by  the  linear 
convolution  and  is  O(iV^). 

Because  of  the  fixed  overhead  associated  with  the  FFT,  for  small  N  (e.g.  N  <  16) 
solving  the  linear  convolution  directly  may  be  faster.  Also,  the  support  of  the  kernel  plays 
a  role  as  well.  The  support  of  R^is  diamond  shaped  (See  Figure  4.1a.).  If  the  kernel  is  of 
the  “bow-tie”  class  (See  Figure  4.  lb.),  then  aliasing  does  not  occur  and  the  circular  convo¬ 
lution  implemented  by  the  FFT  creates  no  error.  However,  if  the  kernel  extends  beyond  the 
“bow-tie”  region,  then  aliasing  will  occur  if  the  rows  of  the  kernel  and  the  bilinear  signal 
are  not  zero  padded.  This  will  result  in  the  cross-over  point  (i.e.  the  point  at  which  the  FFT 
method  is  faster  than  the  original  method)  increasing  by  a  factor  of  four. 

One  subtlety  is  encountered  as  a  result  of  using  the  DFT.  For  the  odd  rows  of  the  ker¬ 
nel,  the  sampling  shifts  for  the  kernel  and  the  bilinear  signal  must  be  in  opposite  direc¬ 
tions.  For  example,  if 
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Rfit,k) 


/(>+^)/*(<-|)  *even 


(4.4) 


then  kernel,  ^,k),  which  is  the  inverse  Fourier  transform  in  the  p  direction  of  a  kernel 
function,  y,  defined  in  the  t-k  plane,  is  given  by 


^{n,k)  = 


N/2-l 


I 

k  even 

/  =  -N/2 

N/2-i 

I 

k  odd 

t  =  -N/2 

(4.5) 


This  is  now  a  discrete  formulation  which  uses  the  complete  sampling  bandwidth  and  is  27C 
periodic  instead  of  k  periodic  and  can  be  implemented  in  0(N^  log  N). 


4,3.  A  Simple  Method  to  Design  Alias-Free  Kernels 

One  problem  with  the  GDTFD  is  the  calculation  of  an  appropriately  sampled  kernel, 
y.  Often,  it  is  convenient  to  design  kernels  in  the  ambiguity  plane  where  the  kernel  acts  as 
a  two-dimensional  multiplicative  filter.  Unfortunately,  that  filter  must  be  altered  such  that 
the  values  used  are  equivalent  to  the  inverse  Fourier  transform  of  t)r  on  the  hexagonal  grid. 
This  will  be  called  the  shifted  kernel  function, 

4.3.1.  Method  1:  The  Continuous  Kernel  (CK)  Method.  Although  Jeong  and  Will¬ 
iams  address  the  continuous  kernel  in  the  time-lag  domain  [29],  no  woric  has  previously 
been  done  with  continuous  kernel  in  the  ambiguity  plane.  The  most  straightforward  meth¬ 
od  of  calculating  the  shifted  is  to  select  a  continuous  ambiguity  domain  kernel, 
4K0,t),  which  meets  some  user  defined  criteria  and  to  take  its  Fourier  transform  in  the  9  di¬ 
rection.  This  yields  the  continuous  time-lag  domain  kernel,  \)r.  The  shifted  sampled  ^n,k) 
(note  this  is  now  discrete)  is  then  found  via  equation  (4.3). 

4.3.2.  Method  2:  Interpolation  Method.  The  second  method  approaches  the  prob- 
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lem  of  finding  the  shifted  ^n,k)  in  a  different  manner.  Ncic  from  Chapter  2,  the  outer 
product  formulation  is  actually  performing  a  convolution  of  a  kernel  defined  in  t-x  with  the 
bilinear  signal.  Examine  the  convolution  in  more  detail.  Let  a  given  even  indexed  row  of 
the  rotated  outer  product  of  the  signal  be  represented  by  X[k],  an  odd  indexed  row  be  rep¬ 
resented  by  X[k+l/2].  Further  let  an  even  indexed  row  of  the  kernel  be  given  by  Y[k]  and 
for  an  odd  row  let  it  be  Y[k-l/2].  The  result  of  the  convolution  of  the  even  rows  of  the  ker¬ 
nel  and  bilinear  signal  is 

X[it]*yW  =  !F[x[n\y[n]]  =  Z[k]  ,  (4.6) 

and  the  convolutions  of  the  odd  rows  of  the  kernel  with  the  odd  rows  of  the  bilinear  signal 
is 

(4.7) 

=  !F[x[n]y[n]]  =  Z[k]. 

Thus,  the  convolution  of  both  the  even  and  odd  rows  results  in  a  function  sampled  only  at 
integer  values  (i.e.  rectangular  sampling).  Note  that  it  is  necessary  in  finite  circular  convo¬ 
lution  for  the  shifts  of  X  and  Y  to  be  in  opposite  direction. 

The  above  result  is  the  justification  for  attempting  to  find  the  shifted  kernel  ^n,k). 
Suppose  the  kernel  were  not  shifted.  Equation  (4.6)  would  be  unchanged,  but  (4.7)  would 
become 

x[t+i]»l'[*l  =  (4.8) 

=  =  z[*  +  l]. 

From  this  it  is  seen  that  the  result  of  not  calculating  the  shifted  ^n,k)  is  the  convolution  of 
the  rotated  outer  product  of  the  signal  and  kernel  will  still  be  on  the  hexagonally  decimat¬ 
ed  rectangular  grid. 
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This  structure  can  be  exploited  by  interpolating  to  hll  in  the  missing  elements,  take 
the  Fourier  transform  along  x  and  create  a  TFD  which  is  27t  periodic  and  has  twice  the  res¬ 
olution  in  time;  hence,  the  time  axis  in  the  examples  for  the  interpolation  method  is  twice 
the  length  of  the  frequency  axis. 

4.3.3.  Method  3:  Phase  Shift  Method.  Another  method  which  could  be  used  to  find 
the  shifted  ^n,k)  is  to  sample  <(>(0,T)  on  a  rectangular  grid  at  intervals  of  1/2  along  0  for  x 
odd,  take  its  discrete  Fourier  transform  and  decimate  the  result.  This  is  equivalent  to  hav¬ 
ing  sampled  =  odd  integer)  at  intervals  of  1/2.  If  \|r  is  decimated  by  taking  the  odd  in¬ 
dexed  samples  (i.e.  \|/(t=0.5,  x  =  odd  integer),  y/(/  =  1 .5,  x  =  odd  integer), . . .)  and  take  the 
inverse  DFT  of  the  resulting  sequence,  then  the  result  is  the  shifted  4>(n,k). 

This  method  is  exactly  equivalent  to  the  well  known  transform  relationship, 

^j2nln/N^  [n]  ^X[k-l],  (4.9) 

where  x[/ij  <->  represent  a  /V  point  DFT  pair  [39].  As  a  result,  this  is  called  the  phase 

shift  method.  In  practice,  the  phase  shift  relationship  is  used  rather  than  decimating  the 
oversampled  DFT  and  inverting. 

With  /=0.5,  the  shifted  ^(n,k)  can  be  easily  be  determined  by  starting  with  the  un¬ 
shifted  ^n,k)  and  modifying  that  as  follows: 


<|>  («,  k) 


<|)  («,  k)  k  even 
^jnn/N^  (,1^  fc)  k  odd. 


(4.10) 


4.4.  Comparison  of  Methods. 

The  Butterworth  kernel  [40]  demonstrates  the  difficulty  that  can  arise  using  the  CK 
method.  The  Butterworth  kernel  is  given  by 


<|)(0,t) 


(4.11) 
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where  tj  and  Bj  are  constants  greater  than  zero  and  N  and  M  are  integers  greater  than  zero. 
Using  contour  integration,  the  Fourier  transform  of  (4. 1 1 )  is 


“"enTJ 


2N 

—  +Z 
T  ^ 


where 


(4.12) 


(4.13) 


For  a  more  detailed  discussion  of  the  calculation  of  the  time-lag  Butterworth  kernel  and  a 
slightly  different  resulting  form  see  [40]. 

While  (4.12)  does  have  a  closed  form  solution,  it  does  not  follow  that  all  kernels  de¬ 
fined  in  the  ambiguity  plane  have  an  inverse  Fourier  transform  in  the  0  direction.  Another 
problem  occurs  when  \|/(t,x)  is  sampled  and  the  inverse  discrete  Fourier  transform  is  taken. 
Since  neither  (4.1 1)  or  (4.12)  have  finite  support,  sampling  the  waveforms  with  a  finite 
number  of  points  over  a  finite  window  is  going  to  introduce  error,  and  for  this  case,  the 
DFT  magnifies  the  error.  For  example,  the  DFT  of  (j>(n,k)  where  is  128  bears  only  little 
resemblance  to  \f{p,k).  (See  Figure  4.4.)  The  result  is,  in  fact,  two  different  kernels  which 
will  asymptotically  approach  each  other  as  the  number  and  density  of  samples  goes  up. 
The  question  which  naturally  arises  is  which  one  is  best?  About  the  only  thing  that  can  be 
said  is  that  it  depends  upon  the  particular  signal,  kernel  and  choice  for  N. 

Unfortunately,  the  CK  method  requires  the  continuous  form  of  the  kernel  in  the 
time-lag  plane.  Thus,  for  a  kernel  defined  in  the  ambiguity  plane,  if  a  closed  form  solution 
to  the  continuous  Fourier  transform  does  not  exist,  the  CK  method  will  not  work.  Further, 
in  applications  where  the  kernel  is  changing  on  a  regular  basis  but  cannot  be  predeter¬ 
mined,  it  may  be  impractical  or  impossible  to  use  the  CK  method.  As  a  result  of  the  re- 
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Figure  4.2.  Result  of  Convolution  of  Impulse  and  Kernel  (a)  Before  and 
(b)  After  Interpolation. 


strictive  requirement  of  needing  the  continuous  form  of  the  kernel  in  the  time-lag  plane, 
the  CK  method  is  not  as  robust  as  the  Interpolation  or  the  phase  shift  methods. 

In  the  case  of  the  Interpolation  method,  with  two  dimensional  interpolation,  some 
time  and  frequency  resolution  is  lost.  For  example,  suppose  a  weighted  linear  interpolation 
is  used  and  the  signal  is  the  unit  sample  sequence.  Then,  the  result  of  convolving  the  bilin¬ 
ear  form  of  this  signal  with  a  kernel  where  t)r(n,0)  =  1  for  n  =  0  and  0  otherwise  is  one  at 
the  origin  and  zero  elsewhere.  (See  Figure  4.2a.)  After  interpolation,  the  impulse  gets 
spread.  Now,  the  impulse  will  show  in  the  -0.5, 0  and  0.5  columns  of  the  TFD.  Further,  the 
terms  in  the  zero  column  will  decay  with  x  of  increasing  magnitude  rather  than  being  a 
constant  as  it  should.  (See  Figure  4.2b.)  This  is  the  worst  case  example  for  the  interpola¬ 
tion  method  and  demonstrates  that  the  localization  of  a  feature  in  the  time-frequency  plane 
can  decrease  by  as  much  50  percent.  Localization  performance  could  be  improved  by  us¬ 
ing  more  sophisticated  types  of  interpolation.  However,  this  comes  with  increased  compu¬ 
tational  complexity. 

The  phase  shift  method  does  not  require  the  continuous  form  of  the  kernel  in  either 
the  ambiguity  or  time-lag  plane;  however,  this  method  is  only  approximately  equal  to  the 
shifted  ^n,k)  found  by  (4.5)  and  is  dependent  upon  the  size  of  N.  As  N  increases,  the  dif¬ 


ference  between  (4.5)  and  (4.10)  decreases.  In  practice,  the  transform  relationship  given  in 
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Figure  4.3.  Multicomponent  Signal  Consisting  of  Two  Tones,  Two  Impulses 
and  a  Chirp. 


(4.9)  is  only  approximate  with  an  error  that  decays  as  \/N,  but  even  for  N  as  low  as  128, 
the  effect  on  the  TFD  is  minor.  See  Table  4.1  at  the  end  of  this  section  for  a  side-by-side 
comparison  of  the  three  methods. 

To  demonstrate  the  differences  between  the  methods  discussed  above,  examine  the 
result  of  applying  each  method  to  the  Butterworth  kernel  given  by  (4.5)  in  the  GDTFD  of 
a  multicomponent  signal.  The  signal  consists  of  two  tones,  one  at  0. 1367t  and  the  other  at 
0.7777C,  two  impulses  at  r  =  0.25  and  t  =  0.625,  and  a  rising  chirp  starting  at  0. 1 871  with  a 
slope  of  0.5627C.  It  is  depicted  in  Figure  4.3. 

If,  in  (4.12),  0j  =  Tj  =  10.0  and  Af  =  A^=  1,  by  the  CK  method,  the  kernel  in  the  time- 
lag  plane  becomes 


Vf(t,k) 


lOOTt  -mt/k 
— ; — e 


(4.14) 


If  this  is  sampled  appropriately  and  the  inverse  DFT  is  taken  as  per  (4.5),  the  kernel  which 
results  is  shown  in  Figure  4.4.  Compare  this  result  to  the  kernel  generated  by  the  shifted 
kernel  calculated  via  the  phase  shift  method.  Here,  the  kernel  is  calculated  in  the  ambigu¬ 
ity  plane  by  sampling  (4.5)  and  applying  a  phase  shift  as  given  by  (4.7).  (See  Figure  4.4.) 


Figure  4.4.  (a)  Comparison  of  Discrete  Butterworth  Kernel  Sampled  in  Time-Lag 
and  (b)  Kernel  Sampled  in  Ambiguity  Plane  and  Fourier  transformed. 

Although  the  continuous  forms  of  the  Butterworth  kernel  in  the  time-lag  plane  and 
ambiguity  plane  are  equivalent  when  calculating  the  continuous  TFD,  when  they  are  sam¬ 
pled  and  windowed  that  relationship  no  longer  holds.  Hence,  the  TFD’s  shown  in  Figure 
4.5a  and  Figure  4.5c  are  actually  two  different  TFD’s  which  will  approach  each  other  as 
the  size  of  the  window  and  the  density  of  the  sampling  increase. 

The  Interpolation  method  produces  a  TFD  shown  in  Figure  4.5b.  In  this  case,  a 
weighted  linear  interpolation  based  on  the  four  nearest  neighbors  in  the  time-lag  domain  is 
used.  Note  the  interpolation  is  performed  after  the  kernel  and  signal  have  been  convolved 
but  prior  to  taking  the  Fourier  transform  in  the  lag  direction.  As  expected,  the  tones  and 
impulses  are  more  difficult  to  detect  visually  and  are  somewhat  less  localized  in  the  Inter¬ 
polated  GDTFD  than  they  are  in  either  the  Continuous  Kernel  GDTFD  or  phase  shifted 
GDTFD.The  difference  between  the  GDTFD’s  created  using  the  Continuous  Kernel  meth¬ 
od  and  the  Phase  Shift  method  are  the  result  of  the  kernels  not  being  equivalent  (i.e.  they 
will  not  produce  they  same  distribution  since  they  are  not  the  same  themselves). 


Figure  4.5.  GDTFD  Using  the  Butterworth  Kernel  via  (a)  the  Continuous 
Kernel  Method,  (b)  Interpolation  Method  and  (c)  Phase  Shift  Method. 


4.5.  Further  Examples. 

To  demonstrat?  the  versatility  of  the  methods  presented,  they  were  also  applied  to 
the  Binomial  Reduced  Interference  Kernel  developed  by  W.  J.  Williams  and  J.  Jeong  [56]. 
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Figure  4.6.  GDTFD  Using  the  Binomiai  Kernel  via  (a)  the  Continuous 
Kernel  Method  (b)  the  Interpolation  Method  and  (c)  the  Phase  Shift  Method. 

The  Binomiai  kernel  provides  a  good  trade-off  between  cross-term  suppression  and  auto¬ 
term  spreading.  In  the  time-lag  plane,  the  sampled  form  is 


and  the  ambiguity  plane  sampled  form  is 


(4.15) 
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Table  4. 1 :  Comparison  of  Alias-Free  Methods 


Ctnnparison 

Criteria 

Methods 

L 

Continuous  Kernel 

*  Interpolation 

Phase  Shift 

Requires  Continu¬ 
ous  Form  of  Ker¬ 
nel  in  Time-Lag 
Domain 

Yes 

^  No 

No 

Accuracy 

Generates  the  correct 
result  for  the  linear  con¬ 
volution  if  aliasing  is 
avoided 

Suffers  from  low  pass 
filter  effects  which  blur 
features  in  Time-Fre¬ 
quency  plane 

Encounters  some 
error  (very  minor)  in 
calculating  the  equiv¬ 
alent  of  the  phase 
shifted  linear  cmivolu- 
tion 

Cost  of  Computing 

OiN^logN) 

0(N^  log  N)  process 
that  requires  additional 
computation  for  inter¬ 
polation 

(e.g.  linear  interpolation 
requires  an  additional 
0(N^)  computations) 

O(N^l0gN) 

IVnly  Alias-Free? 

Yes,  except  for  Discrete 
Wigner  Dfatrihntion. 

No.  Aliasing  is  reduced 
but  dependent  upon 
interpolation  method. 

Yes 

Ambiguity  Kernel 
Ykifis  Same  Distri¬ 
bution  as  Time- 
Lag  Kemd 

No  j 

Yes 

Yes 

<|>(n,*)  =  “f’ "’’(f "  O’ 


Figure  4.6  shows  the  GDTFD  generated  using  the  Continuous  Kernel,  Interpolation  and 
Phase  Shift  methods. 

As  a  final  example,  consider  the  case  of  the  Wigner  distribution.  Suppose  the  three 
methods  were  applied  to  the  Wigner  distribution.  Is  it  possible  to  generate  an  alias-free 
Wigner  distribution?  The  answer  is  yes,  but  only  for  the  Interpolation  and  Phase  Shift 
methods.  The  CK  method  produces  a  standard  aliased  Discrete  Wigner  Distribution 
(DWD).  This  is  the  result  of  the  hexagonal  sampling  yielding  only  zero  terms  for  the  odd 
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Figure  4.7.  GDTFD  Wigner  of  a  Simple  Chirp  via  (a)  the  Continuous  Kernel  (not 
Alias-Free)  Method  (b)  the  Interpolation  Method  and  (c)  the  Phase  Shift  Method. 

rows  of  the  kernel;  hence,  the  result  of  the  convolution  has  non-zero  values  only  at  every 
other  row.  (The  same  as  if  each  column  is  up-sampled.)  Thus,  the  result  is  expected  for  an 
up-sampled  signal  when  the  Fourier  transform  of  the  columns  is  taken — aliasing.  (See 
Figure  4.7.)  In  [29],  Jeong  and  Williams  propose  a  solution  to  this  problem  by  defining  the 
alias-free  DWD  (AF-DWD)  with  the  kernel 
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k  even 


(4.17) 


/  8(0 

However,  this  is  only  an  approximation  to  a  alias-free  Wigner.  It  imposes  a  two  point  mov¬ 
ing  average  filter,  but  that  is  not  a  Wigner.  In  fact,  the  AF-DWD  using  this  kernel  does  con¬ 
tain  aliasing  which  can  be  seen  as  a  small  ridge  running  from  the  upper  left  comer  toward 
the  lower  right.  Cross-terms  are  evident  along  the  top  and  bottom  of  the  figure.  The  cross- 
terms  in  the  upper  right  and  lower  left  comers  are  the  result  of  the  interaction  of  the  posi¬ 
tive  (0  to  7c)  frequencies  and  the  negative  (-ic  to  0)  frequencies.  The  cross-terms  in  the  up¬ 
per  left  and  lower  right  comers  are  the  result  of  the  interactions  of  the  positive  and  nega¬ 
tive  frequency  aliasing  terms.  The  curved  ridges  seen  in  the  center  are  cross-terms  which 
result  from  the  rectangular  window  which  was  applied  to  the  chirp  prior  to  calculating  the 
distribution.  (See  Figure  4,8a.) 

Consider  the  Phase  Shift  method.  Define  the  AF-DWD  by  (4. 1 0)  where  ^(n,k)  =  1 , 
then  the  corresponding  kernel  in  the  time-lag  domain  is  defined  by 

8(r)  keven 

\|/(r,  k)  =  ■  *  -jnn(2t-\)/N  ,  (4.18) 

^  e  k  odd. 

n  =  -N/2 

Figure  4.9  shows  the  impulse  response  of  the  odd  rows  for  the  case  iV  =  128.  As  can  be 
seen,  its  central  peak  is  similar  to  the  kernel  in  (4.17);  however,  other  than  that,  all  similar¬ 
ity  is  lost.  Significantly,  the  AF-DWD  calculated  this  way  exhibits  no  aliasing.  (See  Figure 
4.8b.) 

The  phase  shift  method  produces  a  traly  AF-DWD.  It  is  a  consistent  method  between 
odd  and  even  rows.  Furthermore,  it  generates  the  desired  delta  function  for  the  even  rows. 
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As  a  result,  it  is  claimed  that  this  is  a  more  accurate  description  of  the  kernel  for  the  alias- 
free  DWD  and  should  be  used  rather  than  (4.17). 

In  the  interpolation  method,  aliasing  has  not  been  completely  avoided,  but  is  present 
as  a  side-effect  of  the  interpolation  algorithm  used;  however,  the  peak  magnitude  of  the 
aliasing  terms  are  between  10  and  20  dB  down  relative  to  the  primary  signal. 

The  phase  shift  method  produces  an  AF-DWD,  but  it  does  have  cross  terms  present. 
The  cross  terms  are  seen  in  the  lower  left  and  upper  right  comers  of  Figure  4.8b  and  are  a 
result  of  the  interaction  of  the  positive  and  negative  frequencies  (only  the  positive  frequen¬ 
cy  is  shown).  These  are  expected  when  dealing  with  a  real  signal.  The  curved  ridges  in  the 
center  are  also  expected  and  are  the  result  of  the  rectangular  window  which  was  applied  to 
the  chirp  prior  to  calculating  the  AF-DWD. 

4.6.  Conclusion 

In  this  chapter,  a  method  was  presented  to  reduce  the  computations  necessary  to  cal¬ 
culate  the  GDTFD  from  O(Af^)  to  0{N^  log  N)  by  using  circular  rather  than  linear  convolu¬ 
tion  via  FFT.  Three  methods  were  presented  for  this  algorithm  to  move  kernels  between 
the  ambiguity  and  time-lag  domains. 

The  Continuous  Kernel  method  is  the  most  accurate  of  the  three  methods  in  deter¬ 
mining  the  outcome  of  the  interpolatic  ever,  it  has  the  major  drawback  of  requiring 
the  continuous  form  of  the  kernel  in  the  time-lag  plane.  Also,  since  the  sampled  \|r(t,T)  is 
not  necessarily  the  DFT  of  the  sampled  (|)(0,t),  kernels  may  inadvertently  be  changed.  In 
addition,  it  is  often  impractical  or  impossible  to  obtain  the  continuous  form  of  the  kernel. 
The  CK  method,  therefore,  has  more  restricted  application  than  the  other  methods. 

The  Interpolation  method  has  problems  in  regard  to  loss  of  localization  in  time  as 
well  as  imposing  extra  low  pass  filtering  along  the  frequency  axis.  Also,  it  requires  an  ad¬ 
ditional  0(N^)  computations  as  a  minimum  to  implement.  Selection  of  the  interpolation 
algorithm  used  has  a  significant  impact  on  the  performance  of  the  Interpolation  method.  In 
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Figure  4.9.  The  Value  of  the  Odd  Rows  of  the  Time-Lag  Kernel 
of  Alias-Free  DWD  (i.e.  the  Impulse  Response  of  the  Filter 
Implemented  by  the  Odd  Rows  of  the  Time-Lag  Kernel. 

this  chapter,  the  results  are  based  on  a  linear  interpolation  of  the  four  nearest  neighbors. 
While  this  is  a  fast  interpolation  method,  it  does  introduce  some  aliasing  into  the  final 
TFD. 

On  the  positive  side,  the  Interpolation  method  does  give  a  TFD  which  is  largely 
alias-free  up  to  the  Nyquist  frequency  (meaning  the  aliasing  effects  are  attenuated),  and  it 
has  twice  the  resolution  in  time  as  the  other  two  methods.  It  also  has  the  strength  of  not 
needing  the  continuous  form  of  the  kernel  in  either  the  ambiguity  or  time-lag  planes. 

The  most  robust  of  the  three  methods  is  the  Phase  Shift  method.  It  yields  a  reason¬ 
ably  accurate  estimate  of  the  shifted  <j)(«,k),  providing  a  TFD  very  similar  to  that  produced 
by  the  CK  method.  Because  it  does  not  suffer  the  problems  associated  with  the  Interpola¬ 
tion  method  and  does  not  require  the  continuous  form  of  the  kernel  in  the  ambiguity  or 
time-lag  planes,  this  is  the  most  versatile  of  the  three  methods  for  calculating  the  alias-free 
form  of  a  kernel  designed  in  the  ambiguity  plane.  It  even  allows  the  calculation  of  an  alias- 
free  Wigner  distribution. 
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With  the  tools  developed  in  this  chapter  it  is  now  possible  to  easily  use  a  kernel  de¬ 
veloped  in  the  ambiguity  plane  or  in  the  time-lag  plane  and  still  have  an  alias-free  kernel. 
In  the  following  chapters,  fixed  kernels  defined  in  the  time-lag  plane  are  used  exclusively 
to  develop  first  decimated  Time-Frequency  Distribution  in  Chapter  S  and  finally  multirate 
Time-Frequency  Distrilnitions  in  Chapter  6,  but  these  developments  are  predicated  on  the 
fact  that  it  is  now  possible  to  easily  obtain  the  alias-free  form  of  a  kernel  in  the  time-lag 
domain  no  matter  where  the  kernel  was  designed. 
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5.1.  Introduction 

A  new  £)ecimated  Generalized  Discrete  Time-Frequency  Distribution  is  developed 
using  the  kernel  design  techniques  of  Chapter  4  combined  with  the  Zak  transform  and 
Weighted  Spectrograms.  These  resulting  decimated  distributions  will  be  used  as  building 
blocks  to  create  MRTFD  in  Chapter  6. 

In  this  chapter,  it  is  shown  that  the  discrete  Zak  transform  is,  with  slight  modifica¬ 
tion,  a  generalization  of  the  STFT.  The  modification  made  to  the  Zak  transform  creates  a 
new  transform  called  the  Windowed  Zak  Transform  (WZT).  The  squared  magnitude  of  the 
WZT  is  a  generalization  of  the  spectrogram  which  is  called  the  Zak-Spectrogram  (ZS). 
The  connection  between  the  Zak  transform  and  the  STFT  makes  it  possible  to  create  fast 
spectrograms  which  trade  bandwidth  for  speed  while  maintaining  the  same  frequency  res¬ 
olution.  The  link  between  the  spectrogram  and  2^  hints  at  a  more  interesting,  new,  re¬ 
sult — D-GD  l  FD.  A  D-GDTFD  generates  a  distribution  which  has  the  same  trade-oflfs 
seen  between  the  ZS  and  the  spectrogram  (i.e.  bandwidth  for  speed).  Implementation  of 
the  D-GDTFD  is  done  via  a  modification  of  the  weighted  spectrograms  method. 

The  idea  of  D-GDTFD  and  its  companion  multirate  implementation  builds  upon  sev¬ 
eral  different  disciplines.  It  ties  together  a  numerical  method  for  calculating  Gabor  trans¬ 
forms  (the  Zak  transform),  Time-Frequency  Distributions,  Weighted  Spectrograms  and 
Multirate.  Figure  5.1  shows  a  block  diagram  of  the  interrelationships  between  these  fields 
and  how  they  are  combined  to  achieve  a  multirate  D-GDTFD. 

This  chapter  is  divided  into  five  sections.  First,  background  material  on  the  Win¬ 
dowed  Zak  transform,  spectrogram,  weighted  spectrograms  and  Time-Frequency  Distribu¬ 
tions  is  given.  Next,  the  Windowed  Zak  Transform  is  used  to  develop  a  new  generalization 
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Figure  5.1.  Interrelation  of  Logical  Elements  Discussed  in  This  Chapter. 


of  the  spectrogram,  called  the  ZS.  Then,  arbitrary  time-frequency  distributions  are  charac¬ 
terized  and  extended  using  ZS’s.  After  this,  a  possible  multirate  implementation  of  the  D- 
GDTFD  is  presented.  In  the  final  section,  an  example  is  given  to  demonstrate  the  relation¬ 
ship  between  GDTFD,  weighted  spectrograms  and  the  D-GDTFD. 


5.2.  Background 

5.2. 1 .  Windowed  Zak  Transform.  The  Zak  transform  was  originally  developed  in 
the  field  of  solid  state  physics  [57].  Since  that  time  it  has  been  used  to  solve  various  prob¬ 
lems  in  mathematics  and  physics.  In  the  area  of  signal  processing,  the  Zak  transform  has 
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been  used  of  determining  coefficients  in  Gabor  transforms  [6],  but  the  physical  meaning  of 
the  resulting  coefficients  have  been  called  into  question  [41]. 


The  Zak  transform  has  been  largely  overlooked  by  the  Time-Frequency  Distribution 
(TFD)  community,  except  for  passing  remarks  that  it  is  possible  to  produce  a  Wigner  dis¬ 
tribution  (WD)  using  the  Zak  [28];  however,  the  problems  associated  with  creating  the  dis¬ 
crete  WD  (DWD) — such  as  finite  window  effects,  non-rectangular  sampling  of  bilinear 
signal  and  aliasing  (see,  for  example.  Chapter  4) — were  not  addressed. 

In  this  section,  the  definition  of  the  Zak  transform  given  in  Chapter  2  is  expanded. 
The  existing  definition  assumes  a  rectangularly  shaped  window.  A  natural  extension  to  this 
definition  of  the  Zak  transform  allows  for  arbitrary  windows  and  is  called  the  Windowed 
Zak  transform.  Using  the  multirate  implementation  of  the  STFT  given  by  Vaidyanathan 
[50],  a  multirate  form  of  the  Windowed  Zak  transform  is  presented. 

5.2. 1.1.  Definition.  The  basic  Zak  transforms  relies  on  the  STFT  using  a 
rectangular  window.  Now,  let  the  definition  of  the  Zak  transform  to  be  broadened  to  in¬ 
clude  a  non-rectangular  window. 

The  definition  of  the  STFT  of  some  signal,/,  is  [39] 

oe  oo 

Tf  it,  ©)  =  I  fix)  h{x-  0  =  j  /  ix +  0  ft  (T)  (5.1) 

--O0  —oo 

Letting  r  =  np,  ©  =  k/N  and  x  =  /p,  /,  n  e  Z,  the  discrete  STFT  is 

Tjr  in,k)  =  X  /  ( [n  +  /]  P)  ft  Up)  (5.2) 

1  =  0 

Without  much  effort,  (5.2)  can  be  rewritten  as 

L~\ 

WZfin,k)  =  J^fi[lm  +  n]p)h{lmp)e~^^''^''^^  (5.3) 

1  =  0 

where  5=  Lm.  As  can  be  seen,  (5.3)  is  a  generalization  of  the  Zak  transform  which  now 


Figure  5.2.  Block  Diagram  of  Multirate  Implementation  of  STFT. 


includes  an  arbitrary  window;  thus,  for  h{t)  =  rect(t),  Z^njc)  =  H^/i,/t).This  formulation  is 
called  the  Windowed  Zak  Transform  (WZT). 

5.2. 1.2.  Multirate  Interpretation.  It  is  possible  to  consider  the  WZT  in  a 
multirate  sense.  This  has  an  advantage  of  providing  a  direct  means  of  implementation 
which  reduces  the  required  component  speed  for  a  given  throughput.  It  provides  an  alter¬ 
nate  interpretation  both  conceptually  and  mathematically  for  analysis  of  the  Zak  transform 
and  W27r  as  nothing  more  than  the  result  of  decimation  and  filtering  operations,  and  as 
will  be  seen  ins  section  5.4,  it  can  be  expanded  to  handle  any  discrete  Time-Frequency 
Distribution. 

Since  both  the  Zak  transform  and  WZT  are  based  upon  the  discrete  STFT,  a  multi¬ 
rate  implementation  of  the  STFT  is  examined  first.  In  [50],  Vaidyanathan  presents  a  block 
diagram  of  the  multirate  implementation  of  the  STFT  (See  Figure  5.2.).  The  multirate 
STFT  takes  the  input  signal,/,  passes  it  through  a  delay  chain  such  that  at  time  n  =  N, 
y(Np)  is  at  a,fi\N  -  l]p)  is  at  b  andy(p)  is  at  c.  The  delay  chain  is  followed  by  a  bank  of  N 
fold  decimators  which  pass  only  every  sample  to  the  filters  The  filters  imple¬ 
ment  the  window  function  in  the  STFT.  For  example,  the  filter  .  jt  -  i(z)  is  just  a  multi¬ 
plier  whose  value  is  /i(kp).  The  block  is  the  discrete  Fourier  transform  coefficient 
matrix  of  dimension  NxN;  thus,  at  every  clock  cycle  the  system  will  output  an  N  point 
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Z^(0,  0) 

z^(0.  1) 

Zf(0,  L-l) 
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Zf{m-\,  1) 


Figure  5.3.  Multii  .le  Block  Diagram  of  Windowed  Zak  Transform. 


Fourier  transform. 

With  the  system  shown  in  Figure  5.2  as  a  building  block,  it  is  simple  to  extend  the 
multirate  implementation  to  include  the  WZ  transform.  Figure  5.3  shows  the  block  dia¬ 
gram  for  this  implementation.  Now,  instead  of  a  single  STFT  filter  bank,  there  is  a  set  of 
filter  banks  preceded  by  an  additional  delay  chain/decimator  combination.  The  filters  E^(z) 
are  equivalent  between  parallel  STFT  filter  banks  as  are  the  coefficients  in  the  matrix 
.  Thus,  a  highly  parallel  multirate  implementation  of  the  Windowed  Zak  transform 
can  be  formed.  At  every  clock  cycle  this  system  will  output  an  m  x  L  WZ  transform 
(i.e.  m  WZ  transforms  which  are  L  long). 

5.2.2.  Spectrogram.  In  this  section,  the  spectrogram  is  discussed.  The  spectrogram 
can  be  calculated  in  numerous  ways.  Two  are  discussed  here.  The  first  method  for  calculat¬ 
ing  the  spectrogram  is  via  Cohen’s  class  of  TFD  [15].  It  can  be  regarded  as  an  outer  prod¬ 
uct  formulation  and  is  discussed  in  5.2.2. 1.  The  second  method  takes  magnitude  square  of 
the  values  of  the  STFT.  This  can  be  thought  of  as  an  inner  product  formulation,  and  it  is 
discussed  in  5.2.2.2.  In  section  5.2.2.3,  a  means  of  calculating  GDTFD's  based  upon  a 


5.5 


sum  of  weighted  spectrograms  is  given.  This  is  called  the  method  of  Weighted  Spectro¬ 
grams. 

5.2.2. 1 .  Outer  Product  Formulation.  The  spectrogram  has  been  shown  to  be 
a  time-frequency  representation  by  Classen  and  Mecklenbrauker  in  [15].  In  Appendix  A,  it 
is  shown  that  in  order  for  (2.2)  (the  outer  product  formulation)  and  the  squared  magnitude 
of  the  STFT  to  produce  the  same  result,  the  time-lag  kernel  and  the  window  in  the  STFT 
must  be  based  upon  a  window  which  is  real  symmetric  and/or  imaginary  anti-symmetric. 
In  other  words,  the  TFD  can  be  written  in  the  form  seen  in  (2.2)  where  the  kernel  is  de¬ 
fined  as 


y(«,  t) 


(5.4) 


and  h{t)  is  a  real  symmetric  and/or  imaginary  anti-symmetric  window  function.  This 
means  \|/  is  equivalent  to  a  Hermitian  operator  and  the  resulting  distribution  will  be  real. 

5.2.2.2.  Inner  Product  Formulation.  The  inner  product  formulation  as 
shown  in  section  2.3.2  can  be  derived  from  the  outer  product  by  means  of  a  double  change 
of  variables.  When  the  kernel  is  of  the  form  seen  in  (5.4),  (2.2)  can  be  rewritten  as 

oo  2 

=  J/(x  +  0/i(T)e‘-'"“(/T  (5.5) 

~oo 

which  has  the  discrete  form 


Tfin,k)T/in,k) 


N-\ 


^-j2nlk/N\ 


2 


1  =  0 


(5.6) 


One  advantage  of  the  inner  product  form  over  the  outer  product  form  is  seen  in  the 
discrete  case.  The  outer  product  form  requires  the  kernel  to  be  sampled  on  hexagonal  ly 
decimated  sampling  grid  to  avoid  aliasing  [29].  This  can  cause  some  difficulties  when  de¬ 
signing  kernels  as  was  shown  in  Chapter  4;  however,  the  inner  product  form  is  naturally 
alias-free.  The  sample  points  in  V  (/p  » where  /j  and  t2  are  integers,  correspond  to  the 
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points  in  y\f  when  it  has  been  sampled  on  the  hexagonally  decimated  grid.  Pictorially,  this 
can  be  seen  in  Figure  2.3. 


5.2.2.3.  Arbitrary  Time-Frequencv  Distributions  Using  Weighted  Spectro¬ 
grams.  A  generalized  discrete  TFD  can  be  generated  by  calculating  the  sum  of  weighted 
spectrograms.  The  weighting  factors  are  simply  the  eigenvalues  of  the  kernel  ij/ ,  and  the 
window  functions  in  the  spectrogram  is  the  eigenvector  corresponding  to  the  eigenvalue 
[18].  In  other  words,  (2.7)  can  be  rewritten  as 


N 


-j2n(a(t  + 1^) 


dt^ 


(5.7) 


A:  =  1  l-co  i 

where  is  the  non-zero  eigenvalue  and  is  the  k!**  eigenvector  of  ijif .  With  this  formu¬ 
lation,  any  GDTFD  can  be  written  as  the  sum  of  weighted  spectrograms. 


5.3.  Zak-Spectrogram 

This  section  develops  the  ZS.  First  the  ZS  is  defined  in  5.3.1,  and  its  connection  to 
Time-Frequency  Distributions  is  given  in  5.3.2.  Lastly,  a  multirate  implementation  is  giv¬ 
en  for  calculating  the  ZS  in  5.3.3. 

5.3.1.  Definition  and  Properties.  In  [28],  Jansen  describes  an  ambiguity  function 
based  upon  the  continuous  Zak  transform.  This  work  was  expanded  by  Auslander,  et  al.,  in 
[6]  to  include  the  finite  discrete  Zak  transform.  Auslander,  et  ai,  dealt  with  the  problem  of 
calculating  Gabor  transforms  and  as  such  they  consider  only  the  cross-Zak  transform.  A 
study  of  the  Zak  transform  and  its  relationship  to  TFD’s  has  not  previously  been  per¬ 
formed.  It  should  be  noted  that  Auslander’s  result  [6]  for  the  cross-ambiguity  function  and 

* 

the  work  done  by  Jansen  [28]  lead  to  the  supposition  that  WZ^  (n,  k)  WZj-(n,  k)  would 
result  in  a  TFD. 

♦ 

If,  in  (5.3),  m  =  1,  then  WZ^  (n,  k)  WZj  (n,  k)  is  nothing  more  than  the  discrete 
spectrogram.  For  m  >  1,  (5.3)  becomes 
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WZy(n,k)  WZ*^(n,  k) 


L-l 

f[(lm  +  n)p]h[lmp] 


2 


^-j2nlk/L\ 


1  =  0 


(5.8) 


which  can  be  interpreted  as  a  decimated  spectrogram  which  will  be  called  the  ZS. 

At  each  time,  n,  the  ZS  is  related  to  the  spectrogram  by  the  relation  of  the  STFT  of  a 
decimated  and  non-decimated  signal.  Since  the  STFT  is  being  used,  it  is  known  that  each 
filter  output  has  a  passband  governed  by  the  relation,  1 /(sampling  period  x  number  of  sam¬ 
ples).  In  the  case  of  the  spectrogram,  the  sampling  period  is  p  and  the  number  of  samples 
is  N.  In  the  ZS,  the  sampling  period  is  pm  and  the  number  of  samples  is  L;  thus,  for  both 
the  spectrogram  and  the  ZS,  the  passband  of  the  individual  filters  is  exactly  the  same,  1/ 
(pLm)  =  l/(pA0-  The  decimation  factor,  m,  instead  of  changing  the  passband  of  the  filters, 
changes  the  bandwidth  of  the  transform.  The  spectrogram  has  a  bandwidth  of  l/2p  while 
the  2^  has  a  bandwidth  of  l/2pm. 

This  is  the  fundamental  relationship  between  the  ZS  and  the  spectrogram.  Band¬ 
width  is  being  traded  for  speed  without  reducing  resolution. 

5.3.2.  Time-Frequency  Interpretation.  From  Appendix  A,  it  is  known  that  for  h 
symmetric  or  anti-symmetric,  there  is  a  corresponding  outer  product  form  of  (5.8)  which 
is  given  by 


WZ^(n.k)Wz'f(r,,k)  =  £  F/[(“  + Y)p]/*[("-y)p] 


1  =  0  « 


-jlnlk/L 


(5.9) 


This  is  nothing  more  than  a  GDTFD  with  a  kernel  defined  as 

v*(«,/)  =  y)p]  **[('*  ~y)p]’  ^5.10) 

and  a  signal  decimated  by  m  at  each  time  interval. 

There  is,  therefore,  justification  for  calling  the  ZS  a  TFD.  Specifically,  the  ZS  is  a 
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Figure  5.4.  Multirate  Block  Diagram  of  Zak-Spectrogram 

member  of  a  new  general  class  of  TFD’s  which  is  called  the  Decimated  Time-Frequency 
Distributions. 

5.3.3.  Multirate  Interpretation.  The  ZS  can  be  implemented  using  the  multirate 
techniques  discussed  earlier.  By  a  simple  extension  of  Figure  5.3,  the  multirate  implemen¬ 
tation  of  the  ZS  can  be  obtained.  The  modification  adds  a  time  varying  filter  at  the  output 
of  the  WZT.  The  filter  is  a  one  point  filter  whose  coefficient  is  dependent  upon  the  input 
such  that  the  real  input  equals  the  real  coefficient  and  the  imaginary  input  equals  the  nega¬ 
tive  of  the  imaginary  coefficient.  The  output  at  each  time,  n,  is  then  the  ZS.  See  Figure  5.4. 

Using  this  multirate  implementation  reduces  the  required  speed  of  the  components 
of  the  system  by  a  factor  of  m  less  than  a  multirate  spectrogram.  Further  it  reduces  the 
storage  requirement  for  the  coefficients  from  N  filter  coefficient  to  L  and  the  number  of 
complex  exponential  coefficients  in  the  discrete  Fourier  transform  from NxNtoLxL. 
Overall,  the  ZS  requires  +  m  less  storage  than  the  spectrogram. 
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This  section  builds  upon  the  idea  of  the  ZS  and  the  weighted  spectrogram  to  produce 
a  Decimated  Time-Frequency  Distribution.  Two  ways  the  ZS  can  be  extended  to  cover  all 
of  Cohen’s  class  using  the  weighted  spectrogram  method  are  by  Eigenvalue  Decomposi¬ 
tion  [2][3][18]  (covered  in  5.4.1)  and  by  Singular  Value  Decomposition  (SVD)  (covered  in 
5.4.2).  Both  methods  can  be  used  to  extend  the  ZS  concept  to  the  GDTFD.  This  extension 
will  be  called  the  Decimated  GDTFD  (D-GDTFD).  The  next  subsection  discusses  the  sim¬ 
ilarities  and  differences  between  the  Eigenvalue  and  Singular  Value  Decomposition  meth¬ 
ods  (section  5.4.3). 

5.4.1.  Extension  of  Zak-Spectrogram  via  Eigenvalue  Decomposition.  The  eigen¬ 
values  and  eigenvectors  of  the  discrete  operator,  tj/,  can  be  used  to  calculate  the  GDTFD 
by  means  of  the  discrete  form  of  (5.7), 


Cf{n,k\\sf)  = 

/=  1 


N-\ 


j2nkl\ 


/  =  0 


(5.11) 


where  N  is  the  dimension  of  the  operator  (i.e.  y  hNxN),  r<N,  ris  the  number  of  non¬ 
zero  eigenvalues,  X,  is  the  eigenvalue  and  jt,  is  the  corresponding  eigenvector.  Letting  N 
=  Lm  and  applying  (5.2)  to  (5. 1 1),  the  formula  for  the  Decimated  GDTFD  results. 


CAn,k-,y^)  = 


1  =  1 


L-  1 


j2nlk\ 


Y,f[Om  +  n)p]x.*{lmp)e  ^ 


/  =  o 


(5.12) 


This  yields  a  TFD  which  is  decimated  by  a  factor  of  m  and  has  a  digital  bandwidth  of  ±7c/ 
m  vice  ±7C  of  the  normal  alias-free  GDTFD. 

5.4.2.  Extension  of  Zak-Spectrogram  via  Singular  Value  Decomposition.  One 
problem  with  the  Eigenvalue  E)ecomposition  method  is,  in  general,  eigenvalues  and  eigen¬ 
vectors  do  not  provide  very  much  information  about  the  operator.  On  the  other  hand.  Sin¬ 
gular  Value  Decomposition  (SVD)  does.  The  SVD  method  is  based  upon  that  proposed  by 
White  [54]  and  Amin  [3]  for  use  in  creating  approximate  kernels.  This  section  extends  the 
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application  of  the  SVD  to  D-GDTFD.  It  should  also  be  noted  that  there  has  recently  been 
interest  in  designing  and  using  asymmetric  kernels  [5}  for  which  the  SVD  is  ideally  suited. 

A  well  known  property  of  the  SVD  is  its  decomposition  of  any  L  x  m  rectangular  op¬ 
erator  into  an  L  X  L  orthogonal  matrix,  (/,  an  L  x  m  diagonal  matrix,  Z,  and  an  m  x  m  or¬ 
thogonal  matrix,  V.  The  product  of  these  matrices  equals  the  original  matrix,  i.e.  A  =  ITLV^ 
if  indicates  complex  conjugate  transpose).  This  may  also  be  written  as 

r 

y=  I 

where  r  is  the  rank  of  A,  Gj  is  the  singular  value  corresponding  to  uj  e  is  the  /* 
column  vector  of  U  and  vj  e  SR"*  is  the  column  vector  of  V.  (For  more  information,  see, 
for  example,  [24].) 

Starting  with  a  general  form  of  the  inner  product  operator,  (\j//,  f) ,  take  the  SVD 
and  recast  the  equation  as  follows 

r 

<v/./)  =  <  X 

)=i 

r 

>='  .  (5.14) 

Now,  add  the  time  shift  and  frequency  shift  operators  defined  in  (2.8). 
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=  Cf{n,km 

Equation  (5.15)  provides  a  new  general  formulation  for  the  GDTFD  which  depends  upon 
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the  product  of  the  STFT  using  two  potentially  different  window  functions.  In  the  degener¬ 
ate  case  of  Uj  =  v,'  or  more  generally,  for  the  Hermitian  kernels  discussed  in  Section  5.4.3, 
the  weighted  spectrogram  results. 


The  advantages  of  tne  SVD  TFD  are:  (1)  it  is  easy  to  obtain  the  condition  number, 
(2)  it  is  easy  to  create  an  approximate  operator  for  which  the  condition  number  is  automat¬ 
ically  known  and  (3)  the  SVD  based  TFD  opens  up  a  whole  new  arena  of  approximate  op¬ 
erators  which  in  turn  provides  insight  into  the  characteristics  of  the  original  operators. 

Extending  the  ZS  via  the  SVD  TFD  is  just  as  easy  as  it  was  via  the  Eigenvalue  De¬ 
composition.  By  performing  the  same  substitution,  the  D-GDTFD  via  the  SVD  TFD  is 

r  fZ,  - 1 

Cf(n,k;^)  =  a,.  J^f(lm)v.*ilm-n)e  ^ 

i  =  1  L/  =  0 


L-\ 
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jlnlk- 
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(5.16) 


5.4.3.  Relationship  of  Eigenvalue  and  Singular  Value  Decomposition.  If  the  class 
of  kernels  is  restricted  to  those  which  are  Hermitian  about  the  lag  axis  in  the  time-lag  do¬ 
main,  i.e.  \|/(r,i:)  =  \|r*(-Z,T),  Singular  Value  Decomposition  becomes  Eigenvalue  Decompo¬ 
sition.  A  brief  explanation  of  this  important  relationship  follows. 

A  kernel  which  is  real  symmetric  and/or  imaginary  anti-symmetric  about  the  lag  axis 
has  an  inner  product  representation  which  is  Hermitian  such  that  A  =A^.  For  a  Hermitian 
matrix.  A,  it  is  known  that  there  exists  a  unitary  operator,  Q,  (i.e.  a  complex  matrix  which 
has  orthonormal  column  vectors)  and  a  real  diagonal  operator.  A,  such  that  A  =  Q\Q^ 
[48].  The  diagonal  elements  of  A  are  the  eigenvalues  (always  real)  of  A  and  the  columns  of 
Q  are  the  corresponding  eigenvectors. 

A  being  Hermitian  implies  AA^  =A^A  =:A^.  Since  A  can  be  rewritten  by  a  similarity 
transform,  A^  =  QAl'Q^.  The  values  A^  are  then  the  eigenvalues  of  AA^  and A^A.  The 
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eigenvalues  of  AA^  and  A^A  are  the  square  of  the  singular  values,  o,,  of  A  [24].  From  this, 
it  can  be  seen  that  for  Hermitian  operators,  <s.  =  JX^j  is  always  the  result  [48].  This  im¬ 
plies  in  this  case  that  all  of  the  information  obtainable  from  the  SVD  can  be  interpreted  di¬ 
rectly  from  the  Eigenvalue  Decomposition. 

The  relationship  between  the  singular  values  and  eigenvalues  implies  a  tie  between 
the  columns  of  the  orthonormal  operators  U  and  V  in  the  SVD  (where  A  =  IfLV^)  and  the 
eigenvectors  in  Q.  Let  V=Q,  and  define  a  new  diagonal  operator,  S,  such  that  =  signiXj). 
Then,  A  =  VSXV^.  Since  V5  is  also  orthonormal  and  spans  the  same  space  as  V  (which  is 
the  same  as  U  since  AA^  =  A^A),  it  is  possible  to  define  U  =  VS;  thus,  the  Singular  Value 
Decomposition,  A  =  ULV^  where  <j.  =  |Xj-|  ,V=Q  and  U  =  QS,  can  be  derived  from  the 
Eigenvalue  Decomposition.  Therefore,  for  kernels  real  symmetric  and/or  imaginary  anti¬ 
symmetric  about  the  lag  axis  (i.e.  \lf(t,i:)  =  (5.15)  can  be  simplified  to 

r  /V-1  J2^2 

Cf{n,k-ysf)  =  X/(0v,.*(/-n)c  ^  (5.17) 

1=1  /  =  0 

Cfn,k\^)  in  (5.17)  will  always  be  real-valued;  however,  this  is  not  necessarily  true  for 
C^n,k;t|/)  in  (5. 15).  It  is  therefore  a  sufficient  condition  for  real-valuedness  of  C^n,k,^)  if 
the  operator  is  Hermitian.  Further,  if  si  is  restricted  to  positive  values  only,  the  resulting 
distribution  will  always  be  positive  [18]. 

Thus,  for  the  case  of  a  ^eal  distribution  where  the  kernel  must  be  Hermitian,  the  Sin¬ 
gular  Value  Decomposition  and  the  Eigenvalue  Decomposition  provide  the  same  informa¬ 
tion. 

5.5.  Implementation  Considerations 

In  this  section,  some  of  the  details  involved  in  implementing  a  decimated  TFD  are 
discussed.  In  subsection  5.5.1. ,  a  decimated  kernel  is  developed  which  could  be  used  to 
implement  the  decimated  TFD  in  the  same  fashion  as  any  other  distribution.  In  subsection 
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Figure  5.5a.  The  Original  Binomial  Figure  5.5b.  The  Decimated  Binomial 

Kernel  Kernel 


5.5.2. ,  a  multirate  implementation  is  proposed  which  takes  advantage  of  the  decimated 
structure  of  the  ZS  using  a  weighted  ZS  approach. 

5.5.1.  Decimated  and  Non-Decimated  Kernels.  Consider  Equation  (5.16).  A  new 
kernel  can  be  created  using  (5.16)  via  (5.13).  The  new  kernel  can  be  thought  of  as  a  deci¬ 
mated  and  upsampled  version  of  the  original  kernel.  This  follows  from  the  decimation  per¬ 
formed  in  (5.16). 

There  are  two  ways  to  look  at  the  kernel  created  by  (5.16).  First,  a  decimated  kernel 
operating  on  a  decimated  signal.  For  example,  suppose  the  level  of  decimation  is  four  (i.e. 
m  =  4).  Then,  every  fourth  value  in  the  and  t2  directions  in  the  kernel  is  taken  and 
formed  into  a  new  kernel.  If  the  original  kernel  was  NxN,  the  new  kernel  is  M4  x  N/4. 
This  kernel  is  then  applied  to  the  decimated  signal  which  also  must  be  decimated  by  four 
and  has  length  N/4. 

The  second  way  to  look  at  the  new  kernel  is  to  consider  (5.16)  as  having  the  window 
functions  v  and  u  which  have  been  decimated  and  then  upsampled.  The  dimension  of  the 
new  kernel  is  the  same  as  the  original  and  the  signal  need  not  be  decimated.  It  does  cause 
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Figure  5.6.  The  First  25  Singular  Values  for  the  Normal  Binomial  Kernel  and  the 
D^imated  Binomial  Kernel. 


the  result  of  (5. 16)  to  be  periodic  with  period  2ic/m  rather  than  2n,  but  in  practice,  the  m  - 
1  periodic  images  need  not  be  calculated. 

By  this  second  method,  a  new  kernel  is  created  which  can  operate  on  the  same  signal 
as  the  original  kernel,  or  put  another  way  it  operates  in  the  same  dimension  I2  space  as  the 
original;  therefore,  this  kernel  is  considered  the  Decimated  GDTFD  kernel.  In  Figure  5.5a, 
an  example  of  an  original  kernel  is  shown.  In  this  case,  it  is  the  Binomial  kernel  [56].  In 
Figure  5.5b,  the  new  kernel  is  shown.  As  can  be  seen,  it  is  considerably  more  sparse  than 
the  original  containing  1/16'*  as  many  non-zero  terms,  and  it  is  the  Decimated  GDTFD 
kernel. 

While  the  original  kernel  and  the  new  kernel  are  obviously  related,  they  are  decided¬ 
ly  non-similar  (in  a  mathematical  sense),  and  hence,  have  different  Eigenvalue  and  Singu¬ 
lar  Value  Decompositions.  This  point  is  driven  home  by  an  analysis  of  the  singular  values 
of  the  examples  in  Figure  6.  If  the  two  kernels  were  similar,  the  singular  values  would  be 
the  same;  however,  as  Figure  5.6  shows,  they  are  not.  The  significance  of  this  is  that  a 
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method  to  create  a  new  set  of  kernels  from  ones  previously  defined  has  been  formed.  The 
new  kernels  are  truly  different,  as  evidenced  by  their  different  singular  values,  from  their 
“parent,”  but  they  behave  the  same  over  the  bandwidth  ±nJm. 

5.5.2.  Possible  Implementation  Strategy.  One  possible  way  of  implementing  the  D- 
GDTFD  is  by  means  of  multirate  techniques.  By  using  multirate  methods,  it  is  possible  to 
define  a  strategy  which  makes  use  of  parallel  design  to  reduce  the  clock  rate  of  the  individ¬ 
ual  computational  elements;  thus,  for  a  given  clock  rate  the  throughput  of  the  system  is 
significantly  higher  than  standard  sequential  design  techniques.  The  implementation  pre¬ 
sented  here  is  based  upon  (5. 16)  and  the  previous  multirate  examples. 

A  multirate  implementation  of  the  D-GDTFD  consists  of  parallel  ZS’s  with  some 
modification.  Each  Windowed  Zak  uses  a  different  set  of  windows  corresponding  to  the 
right  and  left  singular  vectors.  The  WZ  based  on  the  left  singular  vector  is  conjugated  and 
multiplied  by  the  WZ  using  the  right  singular  vectors  and  the  singular  value.  The  result  of 
each  branch  is  summed  to  create  the  final  result.  For  Hermitian  operators,  the  filters 
<u)  =  Eij(z)  are  the  values  vy,  or  of  the  right  or  left  singular  vectors,  respective¬ 

ly,  and  o,  is  the  singular  value  multiplied  by  j,-  from  (5.17).  In  other  words,  for  Hermitian 
operators,  the  window  functions  in  the  STFT  are  identical  and  the  weights  correspond  to 
the  eigenvalues.  Figure  5.7  presents  a  block  diagram  of  one  possible  multirate  implemen¬ 
tation. 

5.6.  Example 

It  is  instructive  to  examine  the  performance  of  the  D-GDTFD  using  weighted  ZS’s 
and  to  compare  it  to  the  GDTFD  created  using  weighted  spectrograms.  The  weighted 
spectrogram  approach  serves  as  our  basis  for  comparison  and  represents  the  current  state- 
of-the-art.  It  must  be  emphasized  that  the  spectrogram  is  a  degenerate  case  of  the  ZS  (i.e. 
the  spectrogram  is  the  ZS  with  a  decimation  factor  of  one).  For  clarity  the  ZS  with  a  deci¬ 
mation  factor  of  one  is  referred  to  as  the  spectrogram  and  as  the  ZS  for  decimation  factors 
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Figure  5.7.  Multirate  Implementation  of  Decimated  Generalized  Discrete 
Time-Frequency  Distribution. 

other  than  one.  In  this  example,  a  comparison  is  made  between  the  TFD  generated  by  the 
weighted  spectrogram  and  TFD  created  by  the  weighted  ZS  with  a  decimation  factor  of 


Four  types  of  distributions  will  be  seen  here:  the  Binomial  GDTFD,  the  approximate 
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Binomial  GDTFD,  the  decimated  Binomial  GDTFD  and  the  approximate  decimated  Bino¬ 
mial  GDTFD.  The  Binomial  GDTFD  is  nothing  more  than  the  classical  form  of  the  alias- 
free  discrete  TFD  using  a  Binomial  kernel  developed  by  Williams  and  Jeong  [56].  The  ap¬ 
proximate  Binomial  GDTFD  is  created  by  first  decomposing  the  Binomial  kernel  via 
SVD.  Then,  the  approximate  TFD  is  calculated  by  means  of  (5.17)  wi.cre  r  is  selected  to 
be  less  than  the  dimension  of  the  kernel.  For  example,  for  r  =  1 ,  only  the  spectrogram 
which  is  generated  by  using  the  singular  vector,  V|,  as  the  window  function  would  be  used 
to  calculate  the  approximate  distribution.  This  singular  vector  is  associated  with  the  largest 
singular  value.  For  r  =  2,  the  previous  spectrogram  multiplied  (or  weighted)  by  the  largest 
singular  value  and  the  spectrogram  created  using  singular  vector  V2  weighted  by  the  sec¬ 
ond  largest  singular  value  are  summed  to  create  an  improved  approximation  to  the  Bino¬ 
mial  GDTFD.  For  successively  better  approximations,  r  is  increased  until  is  zero  or  r 
is  equal  to  the  dimension  of  the  kernel.  At  this  point,  the  distribution  obtained  by  the 
weighted  spectrogram  method  is  no  longer  an  approximation  but  the  true  distribution. 

The  decimated  Binomial  GDTFD  is  calculated  by  using 

r  L-l  _72n/*  2 

C^(/i,k;\|/)  ==  ^  s.a.  2 /(/m)v.*  (/m-«)e  ^  (5.18) 

i  =  1  1  =  0 

where  m  =  4  and  L  =  32.  For  the  true  distribution,  r  is  the  dimension  of  the  kernel  and  N  = 
Lm  =  128.  Equation  (5.18)  is  equivalent  to  calculating  the  distribution  using  the  kernel  in 
Figure  5.5b  where  only  the  frequencies  between  ±7t/4  are  calculated.  The  frequencies 
above  and  below  are  not  needed  since  they  are  periodic  images  of  those  between  ±7c/4.  The 
approximate  decimated  Binomial  GDTFD  is  calculated  in  exactly  the  same  fashion  as  the 
approximate  Binomial  GDTFD  except  (5.18)  is  used  instead  of  (5.17). 

Now,  suppose  there  is  a  bandlimited  signal  consisting  of  a  rising  chirp  and  a  constant 
tone  where  the  chirp  has  twice  the  amplitude  of  the  tone.  Taking  the  GDTFD  of  the  signal 
using  the  Binomial  kernel,  the  resulting  TFD  is  shown  in  Figure  5.8.  This  is,  in  effect,  the 
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Figure  5.8,  Binomial  TFD  of  Bandlimited  Signal 

“approved  solution”  and  is  the  basis  of  comparison  for  this  example.  The  signal  is  band- 
limited  as  can  be  seen.  What  cannot  be  seen  is  small  cross-terms  which  reside  in  the  region 
above  7C/4.  These  terms  tend  to  be  less  than  10~^. 

Next,  calculate  the  approximate  Binomial  and  decimated  Binomial  GDTFD’s.  To 
start,  let  r  =  1.  Then,  the  approximate  Binomial  and  decimated  Binomial  distributions 
would  be  the  spectrogram  and  ZS  based  upon  the  singular  vector  V]  as  the  window  func¬ 
tion.  The  distributions  are  calculated  via  (5.17)  and  (5.18),  respectively.  The  window  func¬ 
tion  is  shown  in  Figure  5.9a.  The  resulting  approximate  Binomial  distribution  is  seen  in 
Figure  5.9b  and  the  approximate  decimated  Binomial  distribution  is  seen  in  Figure  5.9c. 

To  create  a  better  approximation,  additional  weighted  spectrograms  and  weighted 
ZS’s  are  added.  In  this  example,  the  spectrograms  and  ZS’s  using  the  singular  vectors  V2 
and  V3  as  the  window  functions  are  weighted  by  their  respective  singular  values  (the  sec¬ 
ond  and  third  largest)  and  added  to  the  previous  result  (which  is  weighted  by  the  largest 
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Figure  5.9a.  Window  Function  Corresponding  to  the 
Largest  Singular  Value  (i.e.  Singular  Vector  V|). 
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Figure  5.9b.  Approximate  Binomial  GDTFD  Using  One  Weighted 
Spectrogram  with  the  Singular  Vector  vj  Shown  in  Figure  5.9a  as 
the  Window  Function. 
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Figure  5.9c.  Approximate  Decimated  Binomial  GDTFD  Using  One  Weighted 
Zak-Spectrogram  with  the  Singular  Sector  Vj  Shown  in  Figure  5.9a  as  the 
Window  Function. 
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Figure  5. 10c.  Approximate  Binomial  GDTFD  Using  Three  Weighted  Spectrograms 
Based  upon  the  Window  Functions  Seen  in  Figures  5.9a,  5.  ‘  Oa  and  5. 10b. 
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Figure  S.lOd.  Approximate  Decimated  Binomial  GDTFD  Using  Three  Weighted  Zak- 
Spectrograms  Based  upon  the  Window  Functions  Seen  in  Figures  5.9a,  5.10a  and  5.10b. 


singular  value).  Figures  5,1  la  and  5.1  lb  show  the  singular  vectors  Vi  and  V3,  respectively. 
Figures  5.1  Ic  and  5.1  Id  are  the  approximate  Binomial  and  decimated  Binomial  distribu¬ 
tions.  These  distributions  are  the  result  of  r  being  set  equal  to  three  in  (5.17)  and  (5.18). 

As  the  value  of  r  increases  the  closeness  of  the  approximation  will  improve  in  a  L2 
sense  and  in  aL*,  sense  as  well  since  the  and  norms  are  related  by  ||xl|^  <  HxHj  [24]. 
Before  showing  the  final  approximation  in  Figure  5.14,  it  must  first  be  determined  how 
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many  weighted  spectrograms  and  weighted  2^’s  are  necessary  for  a  “good”  approxima¬ 
tion.  To  determine  this,  the  effect  on  the  error  in  the  approximation  of  additional  weighted 
spectrograms  and  weighted  ZS’s  must  be  examined. 

For  this  example,  the  instantaneous  error  is  calculated  over  all  the  computed  fre¬ 
quencies  at  a  given  time,  n, 

2\l/2 

Inst  Lj  error  =  ^2^")  =  -  J  (5.19) 

then  the  average  Li  error  over  the  frame  is  computed  by 

128 

Avg  Lj  error  =  X  ^2  (5-20) 

i=  I 

where  the  frame  is  the  interval  1  <  n  ^  128,  (n,  jk;\|/)  is  a  particular  instant  in  time  of 

the  Binomial  GDTFD  or  the  decimated  Binomial  GDTFD  depending  upon  the  compari¬ 
son  being  made  and  Cy  (n,  is  a  particular  instant  in  time  of  the  approximate  Bi¬ 
nomial  or  decimated  Binomial  GDTFD. 

The  L«  error  is  the  maximum  pointwise  error  over  the  entire  frame  for  all  computed 
frequencies,  i.e. 

error  =  max^  (n,  *;t|r)  -  (n,  |)  (5.21) 

Using  (5.20)  and  (5.21),  the  error  between  the  Binomial  GDTFD  and  the  approxi¬ 
mate  Binomial  GDTFD  is  examined.  This  represents  the  baseline  error  performance  for 
the  existing  technique.  Figure  5. 1 1  shows  the  average  L2  and  error  for  r  =  1 ,. .  .,40.  It  al¬ 
so  shows  the  maximum  excursion  of  the  L2  error  over  the  frame.  These  are  represented  as 
crosses.  The  top  of  the  cross  is  the  largest  L2  error  seen  over  the  frame  of  128  instants  in 
time.  The  bottom  of  the  cross  is  the  smallest  Lq  error  seen  over  the  frame,  and  the  horizon¬ 
tal  bar  in  the  cross  is  the  average  of  the  error  over  the  frame.  The  magnitude  of  these 
values  is  read  off  the  left  vertical  axis.  The  maximum  error  of  all  the  columns  in  the 
frame  is  plotted  as  a  gray  line.  The  scale  for  this  plot  is  read  off  the  left  vertical  axis.  As 
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Figure  5. 11.  The  Li  and  Error  Between  the  Binomial  GDTFD  and  the  Approximate 
Binomial  GDTFD  as  a  Function  of  the  Number  of  Spectrograms  Being  Summed. 


can  be  seen,  both  the  L2  and  error  decrease  rapidly  as  successive  weighted  spectro¬ 
grams  are  added  (i.e.  as  r  in  (5. 17)  is  increased).  The  dashed  line  is  the  logarithm  of  the  av¬ 
erage  of  the  1/2  ^rror  of  each  column  of  the  distribution.  The  scale  of  this  plot  is  read  off 
the  right  vertical  axis. 

It  is  expected,  based  upon  an  analysis  of  the  kernel  in  Figure  5.5b,  that  the  error  be¬ 
tween  the  decimated  Binomial  GDTFD  and  the  approximate  decimated  Binomial  GDTFD 
will  perform  similarly  to  that  seen  in  Figure  5.1 1.  Figure  5.12  shows  the  error  of  between 
these  distributions.  As  expected,  the  error  decreases  as  successive  weighted  ZS’s  are  added 
(i.e.  as  r  in  (5.17)  is  increased). 

The  final  error  of  interest  is  the  error  between  the  Binomial  and  decimated  Binomial 
GDTFD  over  the  region  of  mutual  support.  To  examine  this,  the  error  between  the  Binomi¬ 
al  GDTFD  and  the  approximate  decimated  Binomial  GDTFD  was  calculated.  The  result 
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Figure  5.12.  The  L2  and  Error  Between  the  Decimated  Binomial  GDTFD  and  the 
Approximate  Decimated  Binomial  GDTFD  as  a  Function  of  the  Number  of  Weighted 
2^-Spectrograms- 


of  these  calculations  is  shown  in  Figure  5. 13.  While  the  error  between  the  two  does  de¬ 
crease  with  additional  weighted  ZS  initially,  it  can  be  seen  that  the  error  approaches  a  lim¬ 
it.  In  the  figure,  only  the  first  40  weighted  ZS’s  are  shown,  but  there  is  no  significant  im¬ 
provement  as  additional  weighted  ZS’s  are  included;  thus,  the  Binomial  and  decimated  Bi¬ 
nomial  GDTFD,  while  producing  results  which  are  close,  do  not  produce  convergent 
results. 

This  should  not  be  surprising.  As  shown  in  section  5.5. 1 . ,  the  kernels  are  related  but 
are  not  the  same.  Because  the  distributions  have  different  bandwidths,  the  signal  compo¬ 
nents  will  interact  differently.  That  is,  the  signal  separation,  especially  between  positive 
and  negative  frequencies,  will  be  different;  hence,  the  cross-terms  will  be  affected.  This 
will  impact  the  error  between  the  decimated  and  non-decimated  distributions. 
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Fig;ure  5.13.  The  L2  and  L„  Error  Between  the  Binomial  GDTFD  and  the  Approximate 
Decimated  Binomial  GDTFD  as  a  Function  of  the  Number  of  Weighted  Zak- 
Spectrograms  Being  Summed. 


The  beauty  of  the  SVD  and  Eigenvalue  Decomposition  methods  is  there  is  no  need 
to  calculate  all  of  the  weighted  spectrograms/ZS’s.  For  some  error  threshold,  a  suitable  ap¬ 
proximation  can  be  calculated.  For  the  purposes  of  this  chapter,  an  arbitrary  error  thresh¬ 
old  of  -10*^  was  selected.  This  corresponds  to  somewhere  around  30  weighted  spectro¬ 
grams  in  Figure  5.1 1  and  20  weighted  ZS’s  in  Figure  5.12.  Again  somewhat  arbitrarily,  the 
value  of  r  =  31  is  selected  as  the  approximation  cut-off.  It  will  definitely  result  in  the  dis¬ 
tributions  having  errors  of  less  than  10"^  since  the  32"^  singular  value  is  less  than  lO"^. 
Figure  5.14a  shows  the  final  approximation  to  the  Binomial  GDTFD,  and  Figure  5.14b 
shows  the  final  approximation  to  the  decimated  Binomial  GDTFD.  The  error  between 
these  two  distributions  is  approximately  0.06. 
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Figure  5.14a.  Final  Approximation  to  the  Binomial  GDTFD 
Using  31  Weighted  Spectrograms. 
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Figure  5.14b.  Final  Approximation  to  the  Decimated  Binomial 
GDTFD  Using  31  Weighted  Spectrograms. 


5.7.  Conclusion 

In  this  chapter,  it  has  been  shown  that  the  Zak  transform  is,  with  the  Windowed  Zak 
modification,  a  generalization  of  the  Short-Hine  Fourier  Transform  (STFT).  From  this,  it 
was  shown  that  the  2^  is  a  generalization  of  the  spectrogram.  The  connection  between  the 
ZS  and  the  spectrogram  makes  it  possible  to  create  fast  spectrograms  which  trade  band¬ 
width  for  speed  while  maintaining  the  same  frequency  resolution.  The  ZS  was  then  used 
along  with  the  idea  of  weighted  spectrograms  (via  both  Eigenvalue  Decomposition  and 
SVD)  to  create  Decimated  Time-Frequency  Distributions. 

The  power  of  this  formulation  lies  in  its  connection  with  existing  multirate  tech¬ 
niques  and  the  concept  of  using  small  processing  building  blocks  to  implement  an  arbi- 
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trary  TFD.  Multirate  techniques  provide  a  means  to  implement  the  transforms  using  slow¬ 
er  speed  devices  operating  in  parallel  to  achieve  the  same  throughput  of  standard  computa¬ 
tional  techniques.  For  a  decimation  factor  of  m,  there  is  a  m  fold  increase  in  throughput  (or 
speed  of  calculation).  The  corresponding  reduction  in  discrete  bandwidth  is  from  2n  for 
the  GDTFD  to  2n/m  for  the  D-GDTFD.  An  important  attribute  of  the  D-GDTFD  is  that 
it  requires  significantly  less  storage  than  the  GDTFD.  The  D-GDTFD  requires  only  l/m^ 
of  the  storage  of  the  GDTFD. 

Finally,  an  example  based  upon  the  Binomial  distribution  was  given  to  show  how  the 
ZS  and  spectrogram  could  be  used  to  produce  D-GDTFD  and  GDTFD  respectively.  The 
error  between  the  approximate  GDTFD  and  the  GDTFD  was  examined  as  was  the  error 
between  the  approximate  D-GDTFD  and  D-GDTFD  and  the  error  between  the  approxi¬ 
mate  D-GDTFD  and  the  GDTFD. 

It  has  now  been  shown  that  a  D-GDTFD  does  exist  and  can  be  created  from  any  dis¬ 
crete  kernel.  Chapter  6  will  discuss  the  final  step  in  creating  the  MRTFD:  recombining  D- 
GDTFD’s  into  a  GDTFD. 
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6.  Multirate  Time-Frequency  Distributions 


6.1.  Introduction 

New  Multirate  Time-Frequency  Distributions  (MRTFD)  are  developed  using  the 
multirate  computational  paradigm  (Chapter  3)  combined  with  the  techniques  to  create  a 
discrete  kernel  in  the  time-lag  domain  (Chapter  4)  and  the  ability  to  decimate  a  Time-Fre¬ 
quency  Distribution  (Chapter  S).  The  first  method  is  based  upon  using  the  inner  product 
form  of  the  GDTFD  together  with  the  Singular  Value  Decomposition  of  the  kernel  [Singu¬ 
lar  Value  Decomposition  Multirate  Time-Frequency  Distribution  (SVD  MRTFD)  algo¬ 
rithms];  while  the  second  method  is  based  upon  the  convolutional  interpretation  of  the  out¬ 
er  product  form  of  the  Generalized  Discrete  TFD  [Circular  Convolution  Multirate  Time- 
Frequency  Distribution  (CC  MRTFD)]. 

6.1.1.  Baseline — ^Time-Frequencv  Algorithms.  This  section  will  present  the  algo¬ 
rithms  which  will  serve  as  a  baseline  for  the  rest  o  the  paper.  The  first  algorithm  discussed 
is  the  fast  TFD  algorithm  of  Cunningham  and  Williams  [19] — ^the  only  fast  TFD  algorithm 
to  appear  in  the  literature.  The  second  algorithm  is  the  straightforward  parallel  adaptation 
of  the  method  of  Weighted  Spectrograms  [18]. 

6. 1 . 1 . 1 .  Cunningham  and  Williams  Fast  ITD.  The  fast  algorithm  of  Cun¬ 
ningham  and^^^Iliams  is  based  upon  an  approximation  of  (2.10)  in  which  only  the  r  eigen¬ 
vectors  associated  with  the  largest  eigenvalues  are  used  (r  «  N).  It  is  given  by  [19] 

r  I «  |2 


C^(f,©;V)  = 


I 


-j2n(0(t  +  t,)  , 

+  '  dt, 


(6.1) 


*  =  1  1-00  I 

The  error  is  bound  by  the  magnitude  of  the  largest  eigenvalue  not  used,  i.e. 


s“p/,  ®;v)  I  ^ 


(6.2) 
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At  this  point  it  is  necessary  to  make  an  assumptions  which  will  be  used  throughout 
this  paper.  The  type  of  FFT  used  has  a  direct  impact  on  the  number  of  computations  neces¬ 
sary  to  implement  any  GDTFD.  To  be  sure  that  the  comparison  made  herein  are  on  a  level 
playing  field,  all  the  algorithms  will  be  assumed  to  be  designed  using  the  same  FFT  algo¬ 
rithm  as  a  building  block.  The  split  radix-2  FFT  described  by  Sorensen,  et  al,  in  [46]  re¬ 
quires  N  log2  -  3A^  +  4  multiplies  and  3N  !og2  N-3N  +  4  additions  to  implement.  It  rep¬ 
resents  the  state-of-the-art  and,  as  such,  it  will  be  the  basic  FFT  building  block  used.  It  will 
also  be  assumed  that  all  the  algorithms  will  compute  2N  frequencies  at  N  instants  of  time. 
This  is  done  so  inner  produce  methods  based  upon  the  Weighted  Spectrogram  approach, 
which  calculate  individual  instants  of  time,  cat  be  compared  with  outer  product  methods 
which  calculate  all  N  instants  of  time  at  2N  frequencies.  Using  the  split-radix  2  FFT  of  So¬ 
rensen  to  calculate  all  the  instants  and  frequencies,  the  computational  cost  of  the  Cunning¬ 
ham  and  Williams  Fast  TFD  is  then  rN(2N\og2N  8JV  -t-  4)  multiplications  and 
rNi6N\og2N  +  14iV  +  4)  -  4N^  additions.  The  value,  r,  is  the  number  of  eigenvalues  (and 
hence  spectrograms)  that  the  algorithm  ases  to  approximate  the  GDTFD. 

6.1.1 .2.  The  Parallel  Weighted  Spectrogram  Time-Frequencv  Distribution 
(PWS  I  FD).  For  maximum  throughput,  the  parallel  version  of  this  algorithm  calculates 
each  weighted  spectrogram  in  a  separate  processor.  If  the  kernel  is  not  full  rank  or  an  ap¬ 
proximation  is  used,  the  value  N  is  replaced  by  r  where  r<N.  The  algorithm  is  a  two  stage 
process  where  the  first  stage  calculates  the  individual  spectrograms  and  the  second  stage 
sums  all  the  spectrograms  together. 

The  cost  to  compute  the  first  stage  is  then  driven  by  the  number  of  real  multiplies 
and  additions  needed  to  calculate  the  spectrogram.  For  2N  frequencies,  the  number  of 
multiplies  and  additions  is  given  in  the  column  titled,  “Stage  One.”  The  cost  of  the  second 
stage  is  driven  by  the  number  of  spectrograms  which  must  be  summed  together.  There  are 
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no  multiplications  necessary  and  the  number  of  additions  is  given  in  the  column  titled. 


“Stage  Two.” 

The  time  it  takes  to  calculate  a  distribution  is  dependent  upon  the  greatest  number  of 
calculations  which  must  be  performed  in  each  stage  and  the  number  and  types  of  main 
memory  transactions  which  must  be  done.  The  memory  transactions  are  extremely  device 
dependent  and  are  beyond  the  scope  of  this  paper.  In  Table  4,  the  Longest  Path  is  the  total 
number  of  multiplies  and  additions  needed  to  calculate  a  single  column  of  the  GDTFD  in 
the  most  heavily  loaded  processor  in  each  stage.  This  figure  will  be  used  as  a  device  inde¬ 
pendent  measure  of  the  throughput  of  the  algorithm. 

The  Parallel  Complexity  of  the  algorithm  is  given  by  the  cost  of  the  Longest  Path 
times  the  number  of  parallel  processors  necessary  to  calculate  2N  frequencies  at  N  instants 
of  time,  simultaneously.  The  Sequential  Complexity  is  the  number  of  multiplies  and  addi¬ 
tions  necessary  to  implement  the  algorithm  on  a  sequential  machine. 


Table  6. 1 :  Computational  Costs  of  the  PWS  TFD 


1 

Stage  One 

Stage 

Two 

Longest  Path 

Parallel 

Complexity 

Sequential 

Complexity 

X 

ayviogjA^ 

+  8^  +  4 

0 

2/VIogjyV 
+  8yV  +  4 

N^ilNlog^N 
+  8N  +  4) 

N^ilNlog^N 
+  iN  +  4) 

1 

6A^log2A^ 

+  14/V  +  4 

^Nlog^N 
+  l4Ar  +  4 

N^(imog^N 
+  14N  +  4) 

iv^iemog^N 
+  14/V  +  4) 

+  IN^log^N 

6.2.  Singular  Value  Decomposition  MRTFD 

In  this  section,  a  MRTFD  based  upon  a  modification  of  the  SVD  technique  intro¬ 
duced  in  Chapter  5  is  presented.  The  SVD  MRTFD  is  based  upon  the  inner  product  formu¬ 
lation  of  the  GDTFD,  and  as  a  result,  it  can  be  used  to  calculate  the  GDTFD  at  a  single  in¬ 
stant  in  time.  As  such,  it  is  called  an  isolated  column  method.  In  section  6.2.1,  the  theoret¬ 
ical  framework  for  the  SVD  MRTFD  is  presented.  Then,  in  section  6.2.2,  the  important 
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special  case  of  a  Hermitian  kernel  is  considered.  Finally,  the  resulting  algorithm  and  its  as¬ 
sociated  computational  complexity  are  discussed  in  sections  6.2.3  and  6.2.4,  respectively. 

6.2.1.  Algorithm  Development-A  General  SVD-Based  Multirate  TFD.  The  first 
step  in  creating  a  SVD  MRTFD  is  to  decompose  the  inner  product  operation  into  a  sum  of 
smaller  operations.  Consider  the  inner  product  form  of  the  discrete  TFD, 

where 

xU)  =  fin +  t)e  (6.4) 

where  n  e  Z,  and  k,  fj,  =  0,1,2,...,N-1. 

Let  and  t\  represent  the  odd  and  even  terms  of  tj,  respectively.  Then, 

Cj^  (n,  can  be  rewritten  in  terms  of  and  t\  as 

or  equivalently  as 

Cy.(n,  Ar;\|t)  =  <V(/p /j) 

0  B 

Similarly,  replacing  x  (/j)  with  the  sum  x  (tj)  +  x  (tj) » the  inner  product  form  of  the 
GDTFD  can  be  rewritten  as 


(6.7) 


(6.8) 
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In  the  preceding,  the  odd-even  sampling  performed  on  the  kernel  and  signal  is  equiv¬ 
alent  to  decimating  the  signal  by  two  in  both  the  /j  and  tj  directions  and  decimating  the 
kernel  by 


H  = 


2  0 
P  2 


27. 


(6.9) 


(see,  for  example,  [14]  or  [50],  for  more  detail). 

A  more  general  case  for  arbitrary  square  decimation  will  now  be  developed.  The 
decimation  matrix  for  square  decimation  is  given  by 


H  = 


m  0 
0  m 


=  /n/j. 


(6.10) 


Now,  assume  there  is  an  arbitrary  square  kernel,  such  that  its  dimension,  N,  is  de¬ 
visable  by  m  such  that  let  N/m  =  Le  Z.  Etefine  a  block  matrix,  A\  as  was  done  in  (3.4)  and 
(3.5).  The  blocks  in  (3.4)  are  called  the  decimated  kernels  because,  in  practice,  each  block 
matrix  is  the  result  of  two  dimensional  decimation.  They  operate  on  the  decimated  signal, 
each  decimated  kernel  in  a  different  multirate  channel.  Equation  (3.4)  is  represented  in 
block  form  as  a  notational  convenience,  but  it  is  important  to  remember  that  each  block 
will  be  processed  as  a  separate  entity. 

The  signal  must  also  be  rearranged  to  take  advantage  of  the  block  structure.  Return¬ 
ing  to  x{f)  as  defined  in  (6.4),  the  signal  is  rearranged  as  was  done  in  (2. 15)  to  create  x'{t). 
This  is  the  decimated  signal. 

The  generalized  square  decimation  algorithm  (or  the  square  multirate  algorithm) 
solves 


Cf{n,k,A)  =  <A'(f,,f2)x'(r,),x'(/2)>.  (6.11) 

If  OT  =  2,  then  (6. 1 1 )  and  (6.8)  are  identical.  Further,  if  A  =  (jif ,  then  (6. 1 1 )  is  a  MRTFD. 
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To  form  the  SVD  MRTFD,  the  Singular  Value  Decomposition  of  each  block  of  the 
decimated  kernel  must  be  calculated.  Each  block  of  (3.4)  when  applied  to  the  sub-vector, 
X.  (f)  ,  defined  in  (2.16)  becomes 


(6.12) 


Xo,b,  .  ,  o,  b  ,  .  a,  b  , 

Oj  (t^))  (xjt^),v.  (r,)> 

7=1 

Substituting  the  definition  for  jc,(r)  as  defined  in  (6.4),  letting  ^4  =  \jif  and  summing  over 
all  possible  values  of  a  and  b,  yields  the  equation  for  the  generalized  square  decimation 
DTFD, 

Cf  (ft,  k;\v) 


m-l  m-l  j2nk(a-b) 

(L-\ 

j2nkt\ 

=  X  X- 

/V  a,  b 

X  (t))* f(mt  +  a)e  ^ 

<1*1  1 

i=  1 

^1  =  0 

J 

ri-i 

j2nkt\ 

* 

X 

y  («r  (^))  f{mt  +  b)e 

^1  =  0 

J 

(6.13) 


or  equivalently, 

m-l  m-1  j2K(a-b)k  r. 

C/«,t;y)=XI'  "  I®, <*(*)(<,»(«)* 

a  =  0  b  =  0  i=] 

where 
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L-l  J2nkt 

=  S  iw'l'^(t))*f(mt  +  a.)e  ^  (6.15) 

f  =  0 

and  wr’  ^  represents  either  v^’  *  or  ^ .  Note  that  (^)  is  periodic  with  period  L  = 
NAn  (i.e.  „  (fc)  =  ^(pL  +  k)  ,pe  Z).  Using  this  observation  together  with 

(6.14),  (6.13)  can  be  rewritten  as 
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wherep  =  0,  l,...,m-  1  and^i  =0, 1. 

Alternately,  letting  d  =  a- b  and  c  =  a  +  b,  equation  (6.16)  could  be  written  as 
Cfin,  pL  +  fc,;\j/) 
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This  is  easier  to  implement  as  it  makes  the  outer  summation  a  DFT  which  can,  of  course 
be  solved  by  taking  the  FFT.  This  is  the  SVD  MRTFD  for  the  general  case,  no  assumption 
have  been  made  as  to  the  form  of  the  kernel. 

Equation  (6.17)  can  be  thought  of  as  summing  the  result  (6.13)  along  each  diagonal 
of  (3.4).  This  results  in  a  single  L  point  vector  for  each  diagonal.  The  vectors  form  an  array 
which  has  columns  indexed  by  -m  +  1  to  m  -  1  representing  each  diagonal  and  rows  in¬ 
dexed  by  0  to  L  -  1 .  If  the  array  is  broken  into  two  arrays  [0,L)  x  (-m,0]  and  [OX)  x  [0,- 
m)  (the  zero  column  is  duplicated)  and  the  DFT  of  the  rows  are  taken  and  summed  togeth¬ 
er,  the  result  is  almost  the  GDTFD  for  time  n.  Since  the  zero  column  was  included  twice,  it 
must  be  subtracted  to  produce  the  true  GDTFD. 

6.2.2.  A  SVD  MRlFD  for  Hermitian  Kernels.  It  is  possible  to  significantly  im¬ 
prove  the  performance  of  the  SVD  MRTFD  algorithm  by  assuming  that  the  kernel  is  Her- 

6.7 


mitian.  Only  a  Hermitian  kernel  will  produce  a  strictly  real  TFD,  and  realness  is  a  very 
common  constraint  placed  upon  TFD’s. 

Given  that  the  kernel  is  Hermitian,  the  sub-kernels  along  the  main  diagonal  of  (3.4) 
are  also  Hermitian.  This  can  be  seen  by  examining  the  elements  of  (3.5)  when  i  =  j.  Since 
the  elements  of  A  are  Hermitian,  i.e.  a-  j  =  a*j  , ,  the  diagonal  elements  of  (3.4)  will  also 
be  Hermitian  since  a .  ^  ^  =  a*.-  ^  j  ^  when  i  =  j. 

The  off-diagonal  elements  of  (3.4),  on  the  other  hand,  will  not  be  Hermitian,  but 
each  sub-kernel  will  be  the  complex  conjugate  transpose  of  transpose  elements  of  A'  (i.e. 
A.  J  =  a\  .  where  t  indicates  complex  conjugate  transpose).  This  can  be  seen  by  examin¬ 
ing  the  elements  of  and  A^ ,.  The  elements  of  A,j  are  n,  + 1,„  j  +  cm  ^^d  the  elements  of 

Aji  are  aj  +  i,m,  i  +  cm  where  b.ce  [0,...,Mm  -  1].  It  follows  that  the  elements  of  Aj  ^  are 
+  cm,  i  +  bnf  but  this  is  the  same  as  A;  +  since  a.  j  =  a* j  ,  is  given;  thus, 

A  -  J  =  a!  J. .  This  property  makes  it  unnecessary  to  compute  both  the  sub-kernels  above 
and  below  the  main  diagonal. 

t 

From  the  relationship  A,,  j  =  Aj  ^ ,  it  is  possible  to  show  that  the  singular  values  of 
the  transpose  element  of  A'  are  the  same.  This  combined  with  the  fact  that  the  left  singular 
vectors  of  A  are  the  eigenvectors  of  AA^  and  the  right  singular  vectors  are  the  eigenvectors 
of  A^A,  it  can  be  seen  that  the  left  (right)  singular  vectors  of  A,y  and  the  right  (left)  singular 
vectors  of  A^  ,•  are  the  same.  This  can  be  used  to  reduce  the  complexity  of  (6.17). 

Consider  (6. 13)  for  just  the  case  of  a  transpose  pair  of  sub-kernels  (i.e.  A/j  and  Ay ,), 
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This  is  equivalent  to 
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Since  the  second  part  of  the  sum  in  (6.19)  is  just  the  complex  conjugate  of  the  first  part, 
only  the  real  part  of  one  of  them  must  be  calculated  and  the  result  multiplied  by  two. 
The  resulting  improved  algorithm  is  given  by 
Cj{n,  pL  +  fc,;\|r) 
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(6.20) 


This  is  the  SVD  MRTFD  for  Hermitian  kernels.  It  is  important  to  note  that  for  the  sub-ker¬ 
nels  along  the  main  diagonal  of  (3.4),  the  eigenvalue  decomposition  may  be  used  in  place 
of  the  SVD — resulting  in  a  reduction  by  a  factor  of  two  of  the  number  of  STFT’s  that  need 
to  be  performed  for  these  sub-kernels. 

6.2.3.  The  SVD  MR'IFD  Algorithm.  The  algorithm  given  in  this  section  will  im¬ 
plement  the  SVD  MRTFD  for  Hermitian  kernels.  The  algorithm  for  the  general  case  given 
in  section  6.2. 1  can  be  obtained  by  a  veiy  simple  modification  of  the  one  presented  below. 
These  nuxlifications  will  be  briefly  discussed  at  the  end  of  this  section. 

Using  the  singular  vectors  as  windows  and  the  singular  values  as  weights,  the  first 
stage  calculates  each  S  I  FT  pair  and  multiplies  them  together  for  each  sub-kernel  off  the 
main  diagonal.  For  the  main  diagonal  sub-kernels,  the  corresponding  eigenvectors  are 
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used  as  window  functions  and  the  eigenvalues  are  used  as  weights.  Stage  two  calculates 
the  sum  of  the  each  sub-block.  The  result  is  a  [0,  L  -  I  ]  x  [0,  m  -  I  ]  matrix.  In  the  third 
stage,  the  FFT  of  each  row  is  taken,  the  real  portion  is  multiplied  by  two  and  each  k'l'  row 
is  subtracted  by  the  constant  (which  is  also  real)  representing  the  k'l*  element  of  the  sum¬ 
mation  of  the  blocks  of  the  main  diagonal.  If  the  summation  stage  is  implemented  using  a 
binary  tree  structure,  then  it  is  possible  to  sum  the  L  outputs  of  m  processors  in  the  time  it 
would  take  one  processor  to  perform  Llog2m  additions. 

For  maximum  throughput,  this  algorithm  requires  r^j,  processors  for  each  block; 
thus,  a  total  of  rg  b{nP'  +  m)/2  processors  are  necessary.  The  calculation  of  the  main  diago¬ 
nal  (via  eigenvalues)  is  slightly  different  than  the  remainder  of  the  blocks  (via  SVD). 
Algorithm  6. 1 ;  SVD  MRTFD  for  Hermitian  Kernels 
Barrier  0:  Start 

Load  each  of  the  +  m)f2  processors  with  the  appropriate  por¬ 
tions  of  the  signal’ 

Calculate  the  two  STFT’s  (one  in  the  case  of  the  main  diagonal) 

Pointwise  multiply  the  result  (spectrogram  for  the  main  diagonal) 

Barrier  1: 

Sum  each  block  and  along  each  diagonal 
Output  the  result  of  the  sum  of  each  diagonal  as  a  column  in  a  matrix 

Load  the  rows  of  the  matrix  into  L  processors 
Take  the  FFT  of  each  row 
Multiply  the  real  part  by  two 
Subtract  by  the  main  diagonai 
Barrier  3:  End 

In  the  general  case  of  the  SVD  MRTFD  algorithm,  processors  would  be  re¬ 
quired  for  maximum  throughput.  These  would  be  loaded  with  the  appropriate  sub-kernels 
and  the  two  STFT’s  would  be  calculated  for  all  of  the  blocks.  This  includes  the  main  diag- 


Stage  1: 


Stage  2: 


Stage  3: 
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onal  sub-kernels  which  are  no  longer  Hermitian,  and  as  such,  they  cannot  make  use  of 
eigenvalues  and  eigenvectors.  All  other  operations  would  be  the  same  as  in  the  algorithm 
for  Hermitian  kernels. 

6.2.4.  Computational  Costs  of  the  Improved  Algorithm.  The  computational  costs 
described  in  this  section  are  for  the  SVD  MRTFD  for  Hermitian  kernels.  The  modifica¬ 
tions  to  these  results  for  the  general  case  will  be  briefly  discussed  at  the  end  of  the  section. 

The  longest  path  in  the  first  stage  of  the  algorithm  is  the  number  of  additions  and 
multiplications  it  takes  to  compute  two  L  point  FFT’s  and  a  L  point  complex  multiply.  The 
cost  of  stage  two  is  the  cost  to  compute  the  sums  of  the  blocks  and  the  diagonals.  Stage 
three  has  a  cost  driven  by  a  m  point  FFT  and  a  m  point  real  multiply  and  add. 


Table  6.2:  Computational  Costs  of  the  SVD  MRTFD 


Stage  One 

Stage  Two 

Stage  Three 

Longest  Path 

1 

0 

mlogj/n  -  2m  +  4 

m  m 

+  mlog2m-2m+  12 

1 

Smlogjm  -  2m  +  4 

m  ^\mJ  m 

+  3mlog2m  -  2m  +  1 2 

Table  6.2:  Computational  Costs  of  the  SVD  MRTFD  (continued) 


Sequential  Complexity 


Parallel  Complexity 

r(m^  +  m)^^log2( 
+  ^log2m-m  +  6 
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m 

+  ^•082  (Vfc'") 

+  3mlog2m  -  2m  +  12  j 

1,25' 

/  m 

j  +  m  +  I  j 

+  Nlog^m  -  2N  +  4—  +  4rm^ 
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In  order  to  compare  this  new  algorithm  to  the  baseline  algorithms  in  section  6. 1 . 1 ,  it 
will  be  assumed  that  the  memory  access  times  and  associated  communications  costs  are 
negligible  compared  to  the  computational  costs.  The  time  it  takes  to  compute  the  parallel 
baseline  and  SVD  MRTFD  with  Hermitian  kernels  are  displayed  relative  to  the  time  it 
takes  to  compute  the  Cunningham  and  Williams  fast  algorithm.  In  Figure  6. 1 ,  the  line  indi¬ 
cating  the  fraction  of  the  time  it  takes  to  compute  the  parallel  baseline  compared  to  the 
Cunningham  and  Williams  baseline  is  labeled  “Parallel  Baseline  Algorithm.”  The  parallel 
baseline  is  calculated  as  detailed  in  section  6. 1.1. 2  where  each  weighted  spectrogram  is 
calculated  in  a  separate  processor.  The  fastest  performance  possible  by  this  algorithm  is 
the  time  it  takes  to  calculate  a  2N  point  weighted  spectrogram. 

The  relative  time  it  takes  to  compute  the  SVD  MRTFD  compared  to  the  Cunning¬ 
ham  and  Williams  baseline  is  given  for  two  levels  of  decimation,  m.  As  can  be  seen,  as  the 
level  of  decimation  is  increased  the  relative  time  it  takes  to  calculate  the  distribution  de¬ 
creases.  This  is  due  to  the  increased  number  of  parallel  paths  and  the  reduced  size  of  the 
problem  each  path  must  solve.  The  total  number  of  additions  and  multiplications  will  rise 
somewhat  with  increasing  levels  of  decimation.  The  plot  compares  the  total  number  of 
multiplies  and  additions  for  the  three  algorithms. 

The  speed  of  the  Parallel  Baseline  and  SVD  MRTFD  are  largely  unaffected  by  the 
number  of  singular  values  and  eigenvalues  weights  which  are  used.  Since,  for  maximum 
throughput,  it  is  assumed  that  there  are  as  many  parallel  paths  as  desired,  additional  singu¬ 
lar  values  and  eigenvalues  weights  will  only  slightly  increase  the  number  of  additions  per¬ 
formed  in  the  second  stage,  and  compared  to  the  cost  of  the  overall  algorithms,  these  addi¬ 
tions  can  be  neglected  Thus,  the  time  to  compute  these  algorithms  compared  to  the  Cun¬ 
ningham  and  William  baseline  are  unaffected  by  the  accuracy  of  the  Parallel  Baseline  and 
the  SVD  MRTFD. 
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Figure  6.1.  Comparison  of  Cost  to  Comfmte  Parallel  Baseline  and  SVD  MRTFD  with 
Normalized  Cost  of  Cunningham  and  Williams  Baseline  Algorithm. 


For  the  case  of  a  signal  which  is  128  point  long,  the  SVD  MRTFD  with  a  decimation 
factor  of  eight  can  have  and  increase  throughput  of  over  SO  fold  relative  to  the  baseline 
Cunningham  and  Williams  algorithm.  For  a  decimation  factor  of  16,  the  throughput  can 
increase  by  a  factor  of  100. 

The  cost  of  the  general  (non-Hermitian)  SVD  MRTFD  is  only  slightly  larger  than  the 
SVD  MRTFD  with  Hermitian  kernels.  If  it  is  assumed  that  there  are  as  many  parallel  paths 
as  desired,  then  the  cost  is  roughly  the  same  between  the  two  algorithms;  however,  the 
general  SVD  MRTFD  will  require  twice  as  many  parallel  paths  to  achieve  that  throughput. 

6.3.  Circular  Convolution  Multirate  Time-Frequency  Distribution 

TTie  CC  MRTFD  is  based  upon  the  outer  product  form  of  the  GDTFD,  and  as  such,  it 
is  a  two  stage  algorithm.  It  is  a  block  Time-Frequency  method  calculating  every  column 
(instants  of  time)  in  a  block,  simultaneously.  First,  it  performs  a  multirate  circular  convo- 
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Equation  (6.21)  can  be  thought  of  as  a  two  stage  processes.  The  first  stage  calculates  the 
circular  convolution  of  the  row  of  the  hexagonally  decimated  kernel,  t|r,  and  the  bilinear 
signal,  Rf.  Each  row  is  independent  and  can  be  broken  off  as  a  separate  process.  The  output 
of  this  stage  is  placed  in  a  rectangular  array  indexed  by  n  and  k  such  that  N  e  [-N/2,..., 
N/2  -  1 }  and  k  e  {-N,  1 }.  The  second  stage  takes  the  output  of  the  circular  convo¬ 

lutions  which  now  lie  on  a  rectangular  grid  and  calculates  the  Fourier  transform  of  each 
colunm,  n. 
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The  main  computational  building  block  in  the  CC  MRTFD  is  the  MR  FFT  which 
was  presented  in  Chapter  3.  The  MR  FFT  is  a  two  stage  process.  A  convolution  built  upon 
the  MR  FFT  is,  in  general,  a  four  stage  process.  If  the  kernel  is  fixed  and  can  be  calculated 
a  priori,  the  MR  FFT  based  circular  convolution  is  a  three  stage  process. 

To  perform  the  multirate  circular  convolution,  first  zero  pad  the  rows  of  the  kernel  to 
prevent  convolution  aliasing  and  perform  the  MR  FFT  of  the  kernel  (this  will  be  precom¬ 
puted  a  priori  for  fixed  kernels).  Next,  the  MR  FFT  of  the  rows  of  the  bilinear  signal  are 
calculated  followed  by  the  pointwise  multiplication  of  the  transformed  kernel  and  trans¬ 
formed  bilinear  signal,  i.e. 


x(0)y(0) 

■ 

X(l)Y(l) 

■[<"-"^'1 

)] 

- 1 

1 

y-N 

1 

where  X(i)  and  V(i)  is  a  particular  Fourier  coefficient  of  the  signal  and  kernel,  respectively. 
If  each  row  of  (6.24)  is  local  to  a  single  processor,  then  there  is  no  synchronization  re¬ 
quirement  and  the  inverse  MR  FFT  can  be  directly  calculated. 

The  inverse  MR  FFT  is  performed  by  first  calculating  the  inverse  FFT  of  the  rows  of 
(6.24)  followed  by  the  application  of  the  inverse  phase  shift.  At  this  point  a  barrier  is  en¬ 
countered  and  the  second  stage  is  complete.  The  result  of  this  operation  is  the  Zak  trans¬ 
form  of  the  convolution  of  a  row  of  the  kernel  and  bilinear  signal.  The  final  step  is  to  cal¬ 
culate  the  inverse  FFT  of  the  columns  to  produce  the  circular  convolution  of  a  single  row 
of  the  kernel  and  the  bilinear  signal.  The  complete  process  is  illustrated  in  Figure  6.2. 

This  method  for  calculating  a  multirate  circular  convolution  is  applied  to  all  the  rows 
of  the  bilinear  signal  and  the  kernel.  Each  of  the  2iV  multirate  convolutions  is  a  three  stage 
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Figure  6.2.  A  Three  Stage  Multirate  Circular  Convolution.  Y  Represents  the 
Fourier  Coefficients  of  the  Kernel. 


process;  however,  each  convolution  is  independent  and  all  must  be  completed  prior  to  tak¬ 
ing  the  Fourier  transform  of  the  columns. 

Once  all  the  convolutions  have  been  calculated,  it  is  time  to  perform  the  Fourier 
transform  of  the  columns.  The  columns  are  calculated  using  the  multirate  FFT.  The  two 
stages  of  the  MR  FFT  are  considered  to  be  sub-stages,  and  the  Fourier  transform  of  the  all 
the  columns  is  Stage  Two.  With  the  completion  of  Stage  Two,  the  GDTFD  is  done. 

6.3.2.  The  CC  MRl  FD  Algorithm.  The  loading  of  the  individual  processors  aver¬ 
ages  far  less  than  100  percent  with  this  algorithm.  If  throughput  is  maximized,  then  the  2N 
convolutions  requite  processors  each  resulting  in  a  total  of  (21V)  processors  for 
the  first  stage;  however,  only  IV  MR  FFT  need  to  be  done  in  the  second  stage  which  sug¬ 
gests  that  for  maximum  throughput  a  total  of  nJw  processors  are  needed.  Thus,  half  of 
the  processor  are  idle  during  the  second  pass;  however,  the  second  pass  is  not  as  long  as 
the  first  so  the  total  idle  time  for  the  processors  is  less  than  50  percent.  A  system  which  us- 
es  (2  AO  processor  will  be  called  the  fast  CC  MRTFD  since  this  configuration  would 
maximized  throughput. 

An  alternative  algorithm  can  be  defined  which  maximizes  processor  utilization.  If 
the  input  is  broken  into  two  IV  x  IN  arrays  which  are  processed  sequentially  prior  to  calcu¬ 
lating  the  Fourier  transform  of  the  column,  none  of  the  processors  will  be  idle  during  the 
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Figure  6.3.  Block  Diagram  of  Circular  Convolution  Multirate  TFD.  The  Function 
jc,<0  and  y,(/)  Refer  to  the  Row  of  the  Bilinear  Signal  and  Kernel,  Respectively. 


second  stage.  This  algorithm  will  increase  the  time  it  takes  to  compute  the  GDTFD  by 
slightly  more  than  50  percent.  This  algorithm  will  be  called  the  efficient  CC  MRTFD  since 
it  maximizes  processor  utilization.  A  block  diagram  of  this  algorithm  is  shown  in  Figure 
6.3. 

It  is  assumed  that  m  =  Jw  in  the  both  algorithms  to  prevent  excess  idle  time  in 
the  processors  during  each  MR  FFT.  The  algorithms  flow  as  shown  in  Algorithm  3.2 

6.3.3.  The  Computational  Cost  of  the  Circular  Convolution  MRTFD.  The  cost  for 
the  longest  path  in  stage  one  is  the  cost  of  one  IN  point  MR  convolution.  Assuming 
m  =  JlN  and  the  MR  FFT  of  the  kernel  has  been  pre-calculated,  the  cost  of  the  longest 
path  is  twice  the  cost  of  a  MR  FFT  plus  JlJi  point  complex  multiply  for  the  fast  algo- 
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Algorithm  6.2:  CC  MRTFD  Algorithms 


Fast  CC  MRTFD 

Efficient  CC  MRTFD 

Barrier  0:  Start 

Barrier  0:  Start 

Stage  I: 

Load  2N  rows  of  bilinear 
signal  into  JlN  proces¬ 
sors  per  row 

Stage  1: 

Load  first  A/ rows  of  bilinear 
signal  into  JlN  processor 
per  row 

Each  set  of  J2N  proces¬ 
sors  perform  a  MR  convo¬ 
lution 

Each  set  of  JlN  proces¬ 
sors  perform  a  MR  convo¬ 
lution 

Place  result  in  temporary 
storage 

Place  result  in  temporary 
storage 

Load  second  N  rows  of 
bilinear  signal  into  JlN 
processor  per  row 

Each  set  of  JlN  proces¬ 
sors  perform  a  MR  convo¬ 
lution 

Place  result  in  temporary 
storage 

Barrier  1: 

Barrier  1: 

Stage  2: 

Load  the  -/W2  XoNl2-^ 
columns  of  the  temporary 
storage  into  the  Jw  pro¬ 
cessors  per  column 

Stage  2: 

Load  the  -M2  to  M2  - 1 
columns  of  the  temporary 
storage  into  the  JlN  pro¬ 
cessors  per  column 

Each  set  of  JlN  proces¬ 
sors  perform  a  MR  FFT 

Each  set  of  JlTf  proces¬ 
sors  perform  a  MR  FFT 

Barrier  2:  End 

Barrier  2:  End 

rithm  and  twice  this  figure  for  the  efficient  algorithm.  The  cost  of  the  second  stage  for  both 


algorithms  is  identical  and  is  equivalent  to  the  longest  path  of  a  single  MR  FFT.  The  long¬ 
est  path  for  the  CC  MRTFD  is  the  sum  of  the  firet  and  second  stages.  The  Parallel  Com- 
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plexity  is  simply  the  Longest  Path  times  the  number  of  processors:  in  this  case,  (IN) 
for  the  fast  algorithm  and  nJiN  for  the  efficient  algorithm.  The  Sequential  Complexity 
is  the  same  for  both  algorithms  and  is  the  twice  the  sequential  cost  of  a  MR  convolution 


plus  a  MR  FFT  times  nJiN  . 


Table  6.3:  Computational  Costs  of  the  Fast  CC  MRTFD 


1 

Stage  One 

Stage  Two 

Longest  Path 

Parallel 

Complexity 

Sequential 

Complexity 

X 

iJWlog^N 
-J2N+  10 

JlNlog^N 

-2J2N  +  5 

3j2Nlog2N 
-3J2N+  15 

2N(6N\og2N 
-6N+  15J2N) 

N{mN\og^N 

-%N 

+  10..^+  15) 

1 

eJiNlogjN 
+  3J2N+IQ 

3j2Nlog^N  +  5 

^JTNXog^N 
+  3J2N+  15 

2yV(18Mog2^ 

+  6N+  \5JlN) 

N{3Qmog^N 
+  \2N 

+  \Qj2N+  15) 

Table  6.4;  Computational  Costs  of  the  Efficient  CC  MRTFD 


1 

Stage  One 

Stage  Two 

Longest  Path 

Parallel 

Complexity 

Sequential 

Complexity 

X 

AjWXog^N 
-2JW  +  2O 

-^logjA^ 

-2J2N  +  5 

5j2NXog2N 

-4J2N  +  25 

NiX0NXog2N 
-%N  +  25j^) 

N(X0NXog2N 

-8/V 

+  XOJ2N+  15) 

1 

12^/2iVlog2^V 
+  6J2N  +  2O 

SjWXog^N  +  S 

XSjWXog^N 
+  6jW  +  25 

A/(30Mog2)V 
+  X2N  + 25  JW) 

/VOO/VlogjN 
+  X2N 

+  107^+15) 

As  was  done  in  the  case  of  the  SVD  MRTFD,  a  comparison  of  the  time  it  takes  to 
compute  the  CC  MRTFD  with  the  baseline  algorithms  must  be  performed.  Again,  the  as¬ 
sumption  is  made  that  the  time  to  compute  is  directly  proportional  to  the  number  of  multi¬ 
plies  and  additions  needed  to  calculate  the  longest  path  and  communications  costs  are  neg¬ 
ligible. 

In  order  to  compare  the  CC  MRTFD  algorithm  to  the  baseline  algorithms  found  in 
section  6.1.1,  it  will  be  assumed  that  the  memory  access  times  and  associated  communica¬ 
tions  costs  are  negligible  compared  to  the  computational  costs.  The  time  it  takes  to  com¬ 
pute  the  parallel  baseline  and  the  fast  and  efficient  CC  MRTFD’s  are  displayed  relative  to 
the  time  it  takes  to  compute  the  Cunningham  and  Williams  fast  algorithm.  In  Figure  6.4, 
the  line  indicating  the  fraction  of  the  time  it  takes  to  compute  the  parallel  baseline  com- 
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Figure  6.4.  Comparison  of  Efficient  CC  MRTFD  and  Fast  CC  MRTFD  to 
Parallel  Baseline  Algorithm  and  Normalized  Cunningham  and  Williams 
Baseline  Algorithm. 


pared  to  the  Cunningham  and  Williams  baseline  is  labeled,  “Parallel  Baseline  Algorithm.” 
The  parallel  baseline  is  calculated  as  detailed  in  section  6. 1.1. 2  where  each  weighted  spec¬ 
trogram  is  calculated  in  a  separate  processor.  The  fastest  performance  possible  by  this  al¬ 
gorithm  is  the  time  it  takes  to  calculate  a  2N  point  weighted  spectrogram. 

Figure  6.4  shows  the  potential  speed  up  of  the  CC  MRTFD  over  the  Cunningham 
and  Williams  and  parallel  baselines.  Note  that  unlike  the  SVD  MRTFD  which  is  depicted 
with  constant  values  of  m,  the  CC  MRTFD  uses  a  decimation  factor,  m,  which  changes 
with  the  length  of  the  signal  since  m  =  . 

The  CC  MRTFD  produces  a  distribution  which  is  equivalent  to  having  kept  all  the 
singular  values  and  eigenvalue  weights.  The  Parallel  Baseline’s  throughput  is  largely  unaf¬ 
fected  by  increased  accuracy,  as  before.  Thus,  both  the  CC  MRTFD  and  Parallel  Baseline 
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provide  methods  for  calculating  the  actual  GDTFD  while  the  Cunningham  and  Williams 
Baseline  provides  only  an  approximation. 

For  the  case  of  a  128  point  long  signal,  the  potential  speed  up  of  the  CC  MRTFD 
over  the  baseline  is  on  the  order  of  1000  fold  for  the  efficient  CC  MRTFD  and  slightly 
more  for  the  fast  CC  MRTFD.  The  decimation  factor  which  is  used  for  this  length  signal  is 
16  which  implies  2048  processors  are  needed  for  maximum  throughput  for  the  efficient 
CC  MRTFD  and  4096  processors  are  needed  for  maximum  throughput  for  the  fast  CC 
MRTFD  to  achieve  this  increase. 

6.4.  Comparison  of  SVD  MRl'FD  and  CC  MRTFD 

If  the  application  to  which  the  MRTFD  is  applied  requires  the  time-frequency  distri¬ 
bution  at  each  possible  instant  of  time,  the  CC  MRTFD  algorithm  is  the  most  logical 
choice.  When  calculating  a  block  of  data  the  CC  MRTFD  can  be  over  an  order  of  magni¬ 
tude  faster  than  the  SVD  MRTFD  for  the  same  number  of  parallel  processors.  The  SVD 
MRTFD  is  appropriate  when  only  a  fraction  of  the  distributions  at  the  possible  instants  of 
time  need  to  be  calculated. 

The  performance  of  the  SVD  MRTFD  can  be  improved  by  taking  an  approximation 
to  the  distribution;  however,  as  the  decimation  factor  rises  the  benefit  associated  with  the 
approximation  decreases.  As  the  decimation  factor  increases,  the  size  of  the  sub-kernels 
decrease.  An  approximation  works  by  keeping  a  set  number  of  singular  values  or  eigenval¬ 
ues  and  excluding  the  remainder.  The  performance  is  improved  since  fewer  STFT’s  and/or 
spectrograms  must  be  calculated,  but  for  smaller  sub-kernels,  the  number  of  singular  val¬ 
ues  or  eigenvalues  excluded  decreases;  hence,  the  throughput  improvement  of  the  approx¬ 
imation  decreases. 

When  deciding  which  algorithm  to  use  the  deciding  factor  is  generally  whether  or 
not  the  distribution  needs  to  be  calculated  at  each  instant  of  time.  If  it  does,  then  the  CC 
MRTFD  is  the  clear  choice.  If  it  does  not,  then  the  SVD  MRTFD  may  be  appropriate.  A 
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(a) 


(b) 


(c) 


Figure  6.5.  Comparison  of  Distribution  Calculated  by  (a)  CC  MRTFD,  (b)  Cunningham 
and  Williams  Baseline  Using  Seven  Eigenvalues  and  a  Condition  Number  of  1 .5, 

(c)  SVD  MRTFD  Approximation  Using  a  Condition  Number  of  1 .5. 

good  estimate  of  the  point  at  which  the  SVD  MRTFD  becomes  the  appropriate  choice  is 
when  less  than  one-in-ten  instants  in  time  need  to  be  calculated.  The  exact  cross-over  point 
depends  upon  the  length  of  the  signal,  the  size  of  the  kernel,  the  decimation  factor  and  the 
number  of  available  processors.  To  determine  the  cross-over  point,  the  Longest  Path  times 
the  number  of  parallel  paths  needed  for  a  given  implementation  of  the  CC  MRTFD  and 
SVD  MRTFD  must  be  compared. 


In  this  section  an  example  is  given  which  compares  the  Cunningham  and  Williams 
fast  algorithm,  the  SVD  MRTFD  and  CC  MRTFD  for  the  Binomial  kernel.  The  signal  be¬ 
ing  analyzed  is  composed  of  a  rising  chirp  and  a  tone.  The  chirp  has  twice  the  magnitude 
of  the  tone. 

In  Figure  6.5a,  the  distribution  generated  by  the  CC  MRTFD  algorithm  is  shown.  It 
is  identical  to  the  distribution  that  would  be  created  by  means  of  the  Alias-Free  General¬ 
ized  Discrete  Time-Frequency  Distribution  (See  Chapter  4  or  [29].),  Weighted  Spectro¬ 
grams  where  all  non-zero  eigenvalues  are  used  [18]  or  by  the  SVD  MRTFD  when  all  non¬ 
zero  singular  values  are  used. 


The  GDTFD  calculated  via  the  baseline  Cunningham  and  Williams  fast  algorithm  is 
shown  in  Figure  6.5b.  This  algorithm  obtains  its  speed,  primarily,  by  reducing  the  number 
of  eigenvalues  that  are  kept  and,  hence,  the  number  of  weighted  spectrograms  which  must 
be  calculated.  Figure  6.5b  is  the  distribution  which  results  when  only  the  seven  largest 
eigenvalues  are  kept.  This  is  equivalent  to  limiting  the  condition  number  of  the  kernel  ap¬ 
proximation  to  1.5.  For  this  data  set,  the  resulting  distribution  is  very  close  to  the  actual 
distribution  in  Figure  6.5a,  but  it  is  an  approximation.  The  I2  error  of  the  approximation  is 
bound  by  the  largest  magnitude  of  the  eigenvalues  not  included.  In  this  case,  the  error  is 
bound  by  0.667. 

Figure  6.5c  shows  the  approximation  to  the  distribution  which  is  obtained  when  not 
ail  of  the  singular  values  are  kept  in  the  SVD  MRTFD  algorithm.  One  of  the  advantages  of 
the  SVD  MRTFD  is  the  ability  to  control  both  the  number  of  singular  values  kept  in  each 
sub-kernel  and  the  decimation  factor,  which  in  turn  affects  both  throughput  and  the  num¬ 
ber  of  non-zero  singular  values.  To  more  accurately  compare  with  Figure  6.5b,  all  the 
blocks  of  the  SVD  MRTFD  have  been  limited  to  an  I2  error  bound  of  0.667  (a  condition 
number  of  1.5).  The  number  of  singular  values  kept  in  each  of  the  sub-kernels  ranges  from 
one  to  nine  depending  upon  how  quickly  the  magnitudes  of  the  singular  values  decay.  This 
effect  is  known  as  deflation.  It  should  be  noted  that  this  distribution  is  an  approximation 
like  the  baseline  fast  algorithm,  but  it  is  not  the  same  approximation. 

6.6.  Conclusion 

Two  new  fast  computational  methods  have  been  demonstrated  for  the  calculation  of 
Generalized  Discrete  Time-Frequency  Distributions.  The  computational  paradigm  under¬ 
lying  the  algorithms  is  multirate.  The  SVD  MRTFD  method  is  based  upon  the  inner  prod¬ 
uct  formulation  of  the  GDTFD  and  as  such  it  can  be  used  to  calculate  the  frequency  con¬ 
tent  of  a  signal  for  a  particular  instant  in  time.  Even  for  modest  decimation  value  of  eight, 
throughput  can  be  increased  by  as  much  as  50  fold  over  the  current  state-of-the-art  Cun- 
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ninghatn  and  Williams  fast  algorithm  and  by  an  order  of  magnitude  over  the  parallel  base¬ 
line  algorithm.  The  SVD  MRTFD  is  less  computationally  expensive  when  calculating  se¬ 
lected  columns  of  the  GDTFD,  while  the  CC  MRTFD  is  less  expensive  when  calculating 
every  column  of  the  GDTFD. 
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The  most  significant  and  fundamental  result  presented  in  this  dissertation  is  the  new 
computational  paradigm  for  multirate.  Viewing  multirate  as  a  computational  paradigm  ex¬ 
tends  the  multirate  benefits  (reduced  required  speed  of  computational  elements,  reduced 
cost,  and  increased  throughput)  to  a  much  wider  class  of  problems  than  just  the  design  and 
implementation  of  filter  banks.  Specifically,  multirate  can  be  applied  to  any  problem 
which  can  be  expressed  as  a  set  of  vector-vector,  matrix-vector  or  matrix-matrix  opera¬ 
tions  or  any  combination  of  these  three  types  of  operations  [34].  It  can  and  does  lead  to 
fast  and  parallel  algorithms  by  revealing  the  underlying  parallelism  and  the  inherent  recur¬ 
rent  nature  of  a  particular  problem. 

Considering  multirate  as  a  computational  model  opens  new  avenues  for  multirate  as 
a  divide  and  conquer  formalism.  In  this  application,  multirate  could  be  used  to  solve  any 
I»Y>blem  in  numerical  linear  algebra.  More  importantly  to  the  field  of  signal  processing, 
there  is  a  very  large  class  of  problems  which,  at  heart,  are  numerical  linear  algebra  prob¬ 
lems.  Thus,  in  signal  processing,  multirate  can  be  used  both  as  a  means  to  design  and  im¬ 
plement  filter  banks  and  as  an  underlying  computational  paradigm  for  other  types  of  prob¬ 
lems.  For  example,  multirate  was  applied  to  the  Fast  Fourier  Transform  (FFT)  and  Dis¬ 
crete  Hartley  Transform  (DHT)  to  produce  fast,  parallel  versions  of  these  well  know  signal 
processing  algorithms[34].  In  fact,  the  remainder  of  the  dissertation  was  an  extended  ex¬ 
ample  of  this  paradigm. 

Creation  of  a  Multirate  Hme-Frequency  Distribution  (MRTFD)  algorithm  opens  the 
door  for  a  new  class  of  fast  and  parallel  algorithms  for  Time-Frequency  Analysis  [33].  The 
first  algorithms  in  this  new  class  are  the  Singular  Value  Decomposition  MRTFD  (SVD 


MRTFD)  and  the  Circular  Convolution  MRTFD  (CC  MRTFD)  which  demonstrate  the  po¬ 
tential  to  increase  the  throughput  of  a  Generalized  Discrete  Time-Frequency  Distribution 
(GDTFD)  by  well  over  an  order  of  magnitude.  Specifically,  the  two  MRTFD’s  presented 
here  yield  increases  in  throughput,  compared  to  the  Cunningham  and  Williams  baseline  al¬ 
gorithm  (the  fastest  algorithm  reported  to  date  in  the  literature),  of  50  fold  for  even  a  mod¬ 
est  decimation  factor  of  eight.  Unlike  the  Cunningham  and  Williams  baseline  algorithm, 
the  SVD  MRTFD  and  CC  MRTFD  do  not  trade  accuracy  for  speed.  As  the  decimation  fac¬ 
tor  associated  with  the  MRTFD  increases,  the  parallelization  of  the  algorithm  increases 
and,  as  a  result,  the  potential  throughput  of  the  MRTFD  increases  [33]. 

The  SVD  MRTFD  is  based  upon  the  inner  product  formulation  of  the  GDTFD  and  is 
based  upon  the  Singular  Value  Decomposition  of  the  kernel.  It  demonstrates  the  capability 
to  produce  the  GDTFD  for  a  single  instant  of  time  significantly  faster  than  any  existing  al¬ 
gorithm.  The  improvement  is  strictly  a  function  of  the  decimation  factor  and  the  number  of 
parallel  processing  paths  available. 

The  SVD  MRTFD  has  the  advantage  of  being  able  to  allow  engineering  trade-offs 
between  the  number  of  processors  in  an  implementation,  the  throughput  of  the  system  and 
the  accuracy  of  the  GDTFD  produced.  If  there  are  insufficient  parallel  processing  paths  to 
implement  the  SVD  MRTFD  using  every  singular  value,  then  an  approximation  can  be 
made  to  reduce  the  number  of  parallel  paths  without  increasing  the  time  it  takes  to  com¬ 
puted  the  GDTFD  via  the  SVD  MRTFD.  If  each  sub-kernel  in  the  SVD  MRTFD  is  ap¬ 
proximated  by  limiting  the  condition  number  of  the  sub-kernels,  it  is  possible  to  reduce  the 
number  of  Short-Ume  Fourier  Transforms  (STFT)  necessary  to  implement  the  SVD 
MRTFD.  As  the  condition  number  is  decreased,  the  approximation  to  the  GDTFD  will  be¬ 
come  less  precise,  but  the  number  of  parallel  processing  paths  will  shrink.  The  exact  re¬ 
duction  is  dependent  upon  the  specific  kernel  being  used. 
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The  CC  MRTFD  is  based  upon  the  outer  product  form  of  the  GDTFD  and  calculates 
N  instants  in  tinw  simultaneously.  It  does  this  faster  than  any  existing  algorithm  including 
the  SVD  MRTFD.  Compared  to  the  Cunningham  and  Williams  baseline  algorithm,  the  CC 
MRTFD  is  on  the  order  two  orders  of  magnitude  faster  given  several  hundred  parallel  pro¬ 
cessing  paths.  The  CC  MRTFD  has  the  advantage  of  quickly  calculating  the  exact  GDTFD 
but  must  do  so  in  blocks  N  long. 

To  create  the  MRTFD,  it  was  necessary  to  create  another  new  time-frequency  analy¬ 
sis  tool — the  Decimated  Generalized  Discrete  Time-Frequency  Distribution  (D-GDTFD). 
Its  development  was  the  culmination  of  the  development  of  a  new  collection  of  Zak  trans¬ 
form  based  time-frequency  analysis  tools.  First,  it  was  shown  that  the  discrete  Zak  trans¬ 
form,  with  the  addition  of  a  window,  becomes  a  generalization  of  the  STFT.  This  new 
transform  was  called  the  Windowed  Zak  Transform  (WZT).  Building  upon  the  WZT,  the 
Zak-Spectrogram  (ZS)  was  created  and  shown  to  be  a  generalization  of  the  spectrogram. 
Lastly,  the  ZS  was  used  in  combination  with  the  method  of  weighted  spectrograms  (via 
both  eigenvalue  decomposition  and  SVD)  to  create  D-GDTFD.  These  distributions  trade 
bandwidth  for  speed.  For  a  decimation  factor  of  m,  there  is  an  m  fold  increase  in  through¬ 
put  (or  speed  of  calculation).  The  corresponding  reduction  in  discrete  bandwidth  is  from 
2ii  for  the  GDTFD  to  2n/m  for  the  D-GDTFD.  An  important  attribute  of  the  D-GDTFD 
is  that  it  requires  significantly  less  storage  than  the  GDTFD.  The  D-GDTFD  requires  only 
1/m^  of  the  storage  of  the  GDTFD  [35]  [38]. 

To  allow  the  use  of  alias-free  kernels  designed  in  either  the  ambiguity  or  time-lag  do¬ 
mains,  new  methods  were  created  to  allow  the  easy  (and  fast)  numerical  calculation  of  ker¬ 
nels  in  either  domain  from  a  kernel  defined  in  tite  other  domain  regardless  of  the  plane  in 
which  it  was  designed  [36][37]. 
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An  interesting  area  of  research  which  could  now  be  done  is  to  examine  the  problem 
of  adaptive  kernels  from  the  perspective  of  multirate  as  the  computational  paradigm.  Cur¬ 
rent  adt^tive  kernel  techniques  require  the  repeated  calculation  of  the  entire  distribution  as 
the  kernel  is  iteratively  adapted  to  the  signal.  The  Multirate  Time-Frequency  Distribution 
approach  suggests  the  notion  of  selectively  adapting  the  decimated  kernels.  Thus,  the  size 
of  each  optimization  being  performed  is  reduced  permitting  increased  performance  for 
each  iteration  of  the  kernel.  Perhaps  an  interesting  variation  of  this  idea  would  be  to  adapt 
individual  window  functions  in  either  the  weighted  spectrogram  or  S  VD  approach  instead 
of  adapting  the  entire  kernel. 

A  second  area  for  further  research  is  the  application  of  multirate  to  the  multidimen¬ 
sional  Time-Frequency  Distribution.  The  throughput  problems  encountered  with  the  Time- 
Frequency  Distributions  of  one  dimensional  signal  are  greatly  magnified  with  the  addition 
of  more  dimensions  in  the  input  signal.  Multirate,  in  this  case  multidimensional  multirate, 
provides  one  approach  to  improving  the  performance  of  a  multidimensional  TFD.  One  of 
the  benefits  of  multidimensional  multirate  is  the  ability  to  select  optimum  sampling  struc¬ 
tures  or  latices  that  were  not  available  in  the  one  dimensional  case.  This  can  greatly  in¬ 
crease  the  efficiency  of  the  algorithm  reducing  the  overall  cost,  and  the  resulting  algorithm 
can  also  be  implemented  in  a  parallel  architecture  to  obtain  the  type  of  performance  bene¬ 
fits  detailed  in  this  dissertation. 


Appendix  A.  Spectrograms  and  C  DTFD’s 


The  spectrogram  is  given  by 


5,(/,(0)  =  |F,(Q))f  = 
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Using  the  transforms  pair  given  in  [1 1],  x*  (~t)  <->  X*  (©)  ,  allows  (1)  to  be  rewritten  as 
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Note  that  the  convolution  and  the  Fourier  transform  have  the  following  relationship 

oo 
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Applying  the  Fourier  transform  to  the  right  hand  side  of  (3),  leads  to  an  equation  of  the 
form  found  in  (2)  implying  the  spectrogram  can  be  rewritten  as 


SfU,<o)  =  J 


-j2nx(0 


dx  (4) 


Applying  a  change  of  variables  to  the  interior  integral,  (4)  becomes 


(5) 


Compare  this  result  with  the  convolutional  form  of  the  Generalized  Time-Frequency  Dis¬ 
tribution  using  a  kernel  based  upon  the  bilinear  form  of  the  window,  w(0. 
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oo  oe 

Cfit,<o,W)  =  J  j  Ry(u,x)Wit-u,x)  (6) 


where /?y.(M,  t)  =  /^u  +  |  j/*^u  -  ^  j  and  IV(m,  x)  =  +  Placing 

the  definitions  into  (6)  yields, 


— oo  —OO  \  / 

Equations  (5)  and  (7)  bear  a  marked  similarity.  The  only  difference  lies  in  the  vari¬ 
ables  for  h  and  w.  On  inspection,  it  can  be  seen  that  the  variables  of  /t  are  the  negative  of 
w*  and  the  variables  of  A*  are  the  negative  of  w.  This  implies  that  if  h  is  symmetric  or  anti¬ 
symmetric  and  equal  to  w,  then  (5)  and  (7)  are  identical.  In  practice,  this  is  not  a  severe 
restriction  on  the  windows  used  for  spectrograms. 


A.2 


BIBLIOGRAPHY 


[  1  ]  S.  G.  Akl,  The  Design  and  Analysis  of  Parallel  Algorithms.  Prentice  Hall,  Engle¬ 
wood  Cliffs,  New  Jersey,  1989. 

[2]  Amin,  M.  G.,  “Performance  Comparison  of  Wigner-Ville  Spectrum  Estimators 
Using  Least-Squares  Approximation  of  Kernels,”  Proc.  ISSPA,  Gold  Coast  Austra¬ 
lia,  vol.  2,  1990. 

[3]  - ,  ‘Time-Frequency  Spectrum  Analysis  and  Estimation  for  Non-Station- 

ary  Random  Processes,”  in  Time-Frequencv  Signal  Analysis:  Methods  and  Appli¬ 
cations.  B.  Boashash  (ed.),  Longman  Cheshire,  Melbourne,  Australia,  1992. 

[4]  E.  Anderson,  et  al,  LAPACK  User’s  Guide.  SIAM,  Philadelphia,  1992. 

[5]  F.  Auger,  “Some  Simple  Parameter  Determinations  Rules  for  the  Generalized 
Choi- Williams  and  Butterworth  Distributions,”  IEEE  Signal  Processing  Letters, 
Vol.  1,  No.  1,  pp  9-1 1,  January  1994. 

[6]  L.  Auslander,  I.  C.  Gertner  and  R.  Tolimieri,  ‘The  Discrete  Zak  Transform  Appli¬ 
cation  to  Time-Frequency  Analysis  and  Synthesis  of  Nonstationary  Signals,”  IEEE 
Trans,  on  Sig.  Proc.,  Vol.  39,  No.  4,  pp.  825-835,  April  1991 

[7]  L.  Auslander  and  R.  Tolimieri,  “Is  Computing  with  Finite  Fourier  Transforms  Pure 
or  Applied  Mathematics,”  Bulletin  of  the  American  Mathematical  Society,  Vol.  1 , 
pp  847-897, 1979. 

[8]  B.  Boashash,  ed.  Time-Frequency  Signal  Analysis:  Methods  and  Applications. 
John  Wiley  &  Sons,  New  York,  1992. 

[9]  B.  Boashash  and  A.  Reilly,  “Algorithms  for  Time-Frequency  Signal  Analysis,” 
Time-Frequency  Signal  Analysis:  Methods  and  Applications,  ed.  B.  Boashash, 
John  Wiley  &  Sons,  New  York,  1992. 

[10]  P.  J.  Boles  and  B.  Boashash,  “Application  of  the  Cross- Wigner-Ville  Distribution 
to  Seismic  Data  Processing,”  Time-Frequency  Signal  Analysis:  Methods  and 
Applications,  ed.  B.  Boashash,  John  Wiley  &  Sons,  New  York,  1992. 

[11]  R.  N.  Bracewell,  The  Fourier  Transform  and  Its  Applications.  McGraw-Hill  Book 
Company,  New  York,  1978. 

[12]  G.  Brassard  and  P.  Bratley,  Algorithmics:  Theory  and  Practice.  Prentice  Hall, 
Englewood  Cliffs,  New  Jersey,  1988. 

[13]  T.  Brotherton,  T.  Pollard  and  D.  Jones,  “Application  of  Time-Frequency  and  Time- 
Scale  Representations  to  Fault  Detection  and  Classification,”  IEEE  International 


Bib.  I 


Symposium  on  Time-Frequency  and  Time-Scale  Analysis,  pp.  95-98,  October 
1992. 

[14]  T.  Chen  and  P.  P.  Vaidyanathan,  “Recent  Developments  in  Multidimensional  Mul¬ 
tirate  Systems,”  IEEE  Transactions  on  Circuits  and  Systems  for  Video  Technology, 
Vol.  3,  No.  2,  pp.  116-1 37,  April  1 993. 

[15]  T.  A.  C.  M.  Classen  and  W.  F.  G.  Mecklenbrauker,  “The  Wigner  Distribution — A 
Tool  for  Time-Frequency  Signal  Analysis,  Part  III:  Relations  With  Other  Time-Fre¬ 
quency  Signal  Transforms,”  Philips  J.  Res.,  Vol.  35,  No.  6,  pp.  372-389,  1980. 

[16]  L.  Cohen,  “Time-Frequency  Distributions-A  Review,”  Proceedings  of  the  IEEE, 
Vol.  77,  No.  7,  pp.  941-981,  1989. 

[17]  - ,  “Generalized  Phase-Space  Distribution  Functions,”  J.  Math.  Phys.,  Vol. 

7,pp.  781-786,  1966. 

[18]  G.  S.  Cunningham  and  W.  J.  Williams,  “Kernel  Decomposition  of  Time-Frequency 
Distributions,”  IEEE  Transactions  on  Signal  Processing,  Vol.  42,  No.  6,  June  1994. 

[  1 9]  - ,  “Fast  Implementation  of  Time-Frequency  Distributions,”  IEEE  Interna¬ 

tional  Symposium  on  Time-Frequency  and  Time-Scale  Analysis,  pp  241-244,  Octo¬ 
ber  1992. 

[20]  J.  J.  Dongarra,  J.  Du  Croz,  S.  Hairanarling  and  L  DulT,  “A  Set  of  Level  3  Basic 
Linear  Algebra  Subprograms,”  ACM  Transactions  on  Mathematical  Software,  Vol. 
16,  pp.  1-17,  1990. 

[21]  J.  J.  Dongarra,  J.  Du  Croz,  S.  Hammarling  and  R.  Hanson,  “An  Extended  Set  of 
Fortran  Basic  Linear  Algebra  Subprograms,”  ACM  Transactions  on  Mathematical 
Software,Wo\.  14,  pp.  1-17,  1988. 

[22]  J.  Fang,  L.  Atlas  and  G.  Bernard,  “Advantages  of  Cascaded  Quadratic  Detectors 
for  Analysis  of  Manufacturing  Sensor  Data,”  IEEE  International  Symposium  on 
Time-Frequency  and  Time-Scale  Analysis,  pp.  345-348,  October  1992. 

[23]  B.  D.  Forrester,  “Time-Frequency  Analysis  in  Machine  Fault  Detection,”  in  Time- 
Frequenev  Signal  Analysis:  Methods  and  Applications,  ed.  B.  Boashash,  John 
Wiley  &  Sons,  New  York,  1992. 

[24]  G.  H.  Golub  and  C.  F.  Van  Loan,  Matrix  Computations.  2nd  ed.,  Johns-Hopkins 
University  Press,  Baltimore,  Maryland,  1991. 

[25]  J,  Granata,  M.  Conner  and  R.  Tolimieri,  “Recursive  Fast  Algorithms  and  the  Role 
of  the  Tensor  Product,”  IEEE  Transactions  on  Signal  Processing,  Vol.  40,  No.  12, 
pp.292 1-2930,  December  1992. 


Bib.2 


[26]  Hlawatsch,  F.,  “Regularity  and  Unitarity  of  Bilinear  Time-Frequency  Signal  Rep¬ 
resentations,’’  IEEE  Trans,  on  Information  Theory,  Vol.  38,  No.  1,  pp.  82-94,  Janu¬ 
ary  1992. 

[27]  K.  Hwang  and  F.  Briggs,  Computer  Architecture  and  Parallel  Processing. 
McGraw-Hill,  New  York,  1984. 

[28]  A.  J.  E.  M.  Jansen,  “The  Zak  Transform:  A  Signal  Transform  for  Sampled  Time- 
Continuous  Signals,’’  Philips  J.  Res.,  Vol.  43,  pp.  23-69,  1988. 

[29]  J.  Jeong  and  W.  J.  Williams,  “Alias-Free  Generalized  Discrete-Time  Time-Fre¬ 
quency  Distributions,”  IEEE  Transactions  on  Signal  Processing,  Vol.  40,  No.  1 1 , 
pp.  2757-2765,  1992. 

[30]  - ,  “A  New  Formulation  of  Generalized  Discrete-Time  Time-Frequcocy 

Distributions,”  Proc.  International  Conference  on  Acoustics,  Speech  and  Signal 
Processing,  pp.  3189-3192,  March  1991. 

[31]  R.  Koenig,  H.  Dunn  and  L.  Lacy,  “The  Sound  Spectrograph,”  J.  Acoust.  Soc. 
Anttr.,  Vol.  18,  pp  19-49,  1946. 

[32]  Neng -Chung  Hu,  Hong-I  Chang  and  O.  K.  Ersoy,  “Generalized  Discrete  Hartley 
Transforms,”  IEEE  Trans,  on  Signal  Processing,  Vol.  40,  No,  12,  December  1992. 

[33]  J.  R,  O’Hair  and  B  W,  Suter,  “Multirate:  A  New  Computational  Paradigm,”  Sub¬ 
mitted  to  IEEE  Transactions  on  Signal  Processing,  May  1994. 

[34]  - ,  “Multirate:  A  New  Computational  Paradigm,”  Submitted  to  IEEE 

Transactions  on  Signal  Processing,  May  1994. 

[35]  - ,  ‘The  Zak  Transform  and  Decimated  Time-Frequency  Distributions,” 

Submitted  to  IEEE  Transactions  on  Signal  Processing,  December  1993. 

[36]  - ,  “Kernel  Design  Techniques  for  Alias-Free  Time-Frequency  Distribu¬ 

tions,”  Submitted  to  IEEE  Transactions  on  Signal  Processing,  July  1993. 

[37]  - ,  “Kernel  Design  Techniques  for  Alias-Free  Time-Frequency  Distribu¬ 

tions,”  IEEE  International  Conference  on  Acoustic,  Speech  and  Signal  Processing, 
Adelaide,  Australia,  Vol.  HI,  pp  333-336,  19-22  April  1994. 

[38]  - ,  “The  Zak  Transform  and  Decimated  Spectrograms,”  IEEE  International 

Symposium  on  Circuits  and  Systems,  London,  U.K.,  30  May-2  June  1994. 

[39]  A.  V.  Oppenheim  and  R.  W.  Schafer,  Discrete-Time  Signal  Processing.  Prentice- 
Hall,  Englewood  Cliffs,  NJ,  1989. 


[40]  A.  Papandreou  and  G.  F.  Boudreaux-Bartels,  “Generalization  of  the  Choi-Williams 


Distribution  and  the  Butterworth  Distribution  for  Time-Frequency  Analysis,”  IEEE 
Transactions  on  Signal  Processing,  Vol.  41,  No.  1,  pp.  463-472,  1993. 


[41]  S.  Qian  and  D.  Chen,  “Discrete  Gabor  Transform,”  IEEE  Trans,  on  Sig.  Proc.,  Vol. 
41,  No.  7,  pp.  2429-2438,  July  1993. 

[42]  K.  R.  Rao  and  P.  Yip,  Discrete  Cosine  Transform:  Algorithms.  Advantages.  Appli¬ 
cations.  Academic  Press,  Boston,  1990. 

[43]  P.  A.  Regalia  and  S.  K.  Mitra,  “Kronecker  Products,  Unitary  Matrices  and  Signal 
Processing  Applications,”  SIAM  Review,  Vol.  31,  No.  4,  pp.  586-613,  December 
1989. 

[44]  C.  Runge,  Zeit.  fur  Math,  und  Phvsik.  48  (1903)  943. 

[45]  V.  P.  Sathe  and  P.  P.  Vaidyanathan,  “Effects  of  Multirate  Systems  on  Statistical 
Properties  of  Random  Signals,”  IEEE  Transactions  on  Signal  Processing,  Vol.  41, 
No.  l,pp.  131-146,  January  1993. 

[46]  H.  V.  Sorensen,  M.  T.  Heideman  and  C.  S.  Burrus,  “On  Computing  the  Split-Radix 
FFT,”  IEEE  Trans.  Acoust.,  Speech  and  Sig.  Proc.,  Vol.  34,  No.  1,  pp.  152-156, 
February  1986. 

[47]  H.  V.  Sorensen,  D.  L.  Jones,  C.  S.  Burrus  and  M.  T.  Heideman,  “On  Computing  the 
Discrete  Hartley  Transform,”  IEEE  Trans.  onASSP,  Vol.  33,  pp.  1231-1238,  Octo¬ 
ber,  1985. 

[48]  G.  W.  Stewart,  Introduction  to  Matrix  Computations.  Academic  Press,  Orlando, 
FL,  1973. 

[49]  P.  L.  lyack,  W.  J.  Williams  and  G.  Cunningham,  ‘Time-Frequency  Fine  Structure 
of  Dolphin  Whistles,”  IEEE  International  Symposium  on  Time-Frequency  and 
Time-Scale  Analysis,  pp.  17-20,  October  1992. 

[50]  P.  P.  Vaidyanathan,  Multirate  Systems  and  Filter  Banks.  Prentice  Hall,  Englewood 
Cliffs,  New  Jersey,  1993. 

[51]  C.  Van  Loan,  Computational  Frameworks  for  the  Fast  Fourier  Transform.  SIAM, 
Philadelphia,  1992. 

[52]  E.  F.  Vilez  and  H.  Garudadri,  “Speech  Analysis  Based  on  Smoothed  Wigner-Ville 
Distribution,”  in  Time-Frequencv  Signal  Analysis:  Methods  and  Applications,  ed. 
B.  Boashash,  John  Wiley  &  Sons,  New  York,  1992. 

[53]  V.  Ville,  “Sur  la  Notion  de  Signal  Analytique,”  Cables  et  Transmissions,  tome  2, 
No.  1,  pp.  61-74,  1948. 


Bib.4 


[54]  L.  B.  White,  “Transition  Kernels  for  Bilinear  Time-Frequency  Distributions,” 
IEEE  Trans,  on  Sig.  Proc.,  Vol.  39,  No.  2,  pp.  542-544,  February  1991 

[55]  E.  Wigner,  “On  the  Quantum  Correction  for  Thermodynamic  Equilibrium,”  Phys. 
Rev..  40  (1932)  749-759. 

[56]  W.  J.  Williams  and  J.  Jeong,  “Reduced  Interference  Time-Frequency  Distribu¬ 
tions,”  in  Time-Frequency  Signal  Analysis:  Methods  and  Applications.  B. 
Boashash  (ed.),  Longman  Cheshire,  Melbourne,  Australia,  1992. 

[57]  Zak,  J.,  “Finite  Translations  in  Solid  State  Physics,”  Ph\s.  Rev.  Lett.  19,  pp.  1385- 
1397,  1967. 


Bib.S 


Captain  John  R.  O’Hair  was  bom  at  West  Point,  New  York,  on  16  November  1961 . 
He  graduated  from  Butte  Central  High  School  in  Butte,  Montana  in  May  of  1980,  and 
entered  the  United  States  Air  Force  Academy  the  following  June.  He  graduated  with  a 
Bachelor  of  Science  in  Electrical  Engineering  in  1984  and  was  assigned  to  Headquarters 
Armament  Division  as  an  Intelligence  Analyst.  During  the  next  two  years,  he  performed 
technical  analysis  of  radar  and  Surface-to-Air  Missile  systems  and  attended  night  school 
at  the  University  of  West  Florida  (UWF).  He  earned  a  Masters  of  Business  Administration 
from  UWF  in  1986.  That  same  year,  he  entered  Texas  Tech  University  in  Lubbock,  Texas, 
under  the  auspices  of  the  Air  Force  Institute  of  Technology’s  Civilian  Institute  Program. 
He  earned  a  Master  of  Science  in  Electrical  Engineering  from  Texas  Tech  University  in 
1987  in  the  area  of  pulse  power  solid  state  devices.  Upon  graduation,  he  was  assigned  to 
the  Foreign  Technology  Division  at  Wright-Patterson,  ATO,  Ohio,  where  he  was  responsi¬ 
ble  for  the  procurement  of  signal  processing  and  analysis  systems.  He  entered  the  Air 
Force  Institute  of  Technology  School  of  Engineering  in  July  1991 . 


